Abstract



We consider the nonparametric regression estimation problem of recovering an unknown
response function $f$ on the basis of spatially inhomogeneous data when the design points follow a
known compactly supported density $g$ with a finite number of well-separated zeros. In particular,
we consider two different cases: when $g$ has zeros of a polynomial order and when $g$ has zeros of an
exponential order. These two cases correspond to moderate and severe data losses, respectively.
We obtain asymptotic minimax lower bounds for the global risk of an estimator of $f$ and construct
adaptive wavelet nonlinear thresholding estimators of $f$ which attain those minimax convergence
rates (up to a logarithmic factor in the case of a zero of a polynomial order) over a wide range
of Besov balls.

The spatially inhomogeneous ill-posed problem that we investigate is inherently more difficult 
than spatially homogeneous problems like, e.g., deconvolution. In particular, due to spatial 
irregularity, assessment of minimax global convergence rates is a much harder task than the 
derivation of minimax local convergence rates studied recently in the literature. Furthermore, 
the resulting estimators exhibit very different behavior and minimax global convergence rates in 
comparison with the solution of spatially homogeneous ill-posed problems. For example, unlike
in the deconvolution problem, the minimax global convergence rates are greatly influenced not only
by the extent of data loss but also by the degree of spatial homogeneity of $f$. Specifically, even
if $1/g$ is not integrable, one can recover $f$ as well as in the case of an equispaced design (in
terms of minimax global convergence rates) when $f$ is homogeneous enough, since the estimator
is "borrowing strength" in the areas where $f$ is adequately sampled.

Keywords: Adaptivity, Besov spaces, inhomogeneous data, minimax estimation, nonparametric
regression, thresholding, wavelet estimation.
1 Introduction



The applicability of the majority of techniques for estimation in the nonparametric regression model
rests on the assumption that the data are equispaced and complete. These assumptions were mainly
adopted by the signal processing community, where the signal is assumed to be recorded at equal
intervals in time. However, in reality, due to unexpected losses of data or limitations of data
sampling techniques, data may fail to be equispaced and complete. To this end, we consider the problem
of recovering an unknown response function $f \in L^2([0,1])$ on the basis of irregularly spaced
observations, i.e., when one observes $y_i$ governed by

$$y_i = f(x_i) + \sigma \xi_i, \quad i = 1, 2, \ldots, n, \tag{1.1}$$

where $x_i \in [0,1]$, $i = 1, 2, \ldots, n$, are fixed (non-equidistant) or random points, $\xi_i$,
$i = 1, 2, \ldots, n$, are independent standard Gaussian random variables and $\sigma^2 > 0$ (the noise
level) is assumed to be known and finite. Model (1.1) can be viewed as a problem of recovering a signal
when part of the data is lost (e.g., in cell phone use) or unavailable (e.g., in military applications).
Model (1.1) is also intimately connected to the problem of missing data since the points $x_i$,
$i = 1, 2, \ldots, n$, can be viewed as the remainder of $N$ equidistant points $j/N$, $j = 1, 2, \ldots, N$,
after observations at $(N - n)$ points have been lost. However, there is a great advantage in treating
the missing data problem as a particular case of a nonparametric regression problem: with the last
decade seeing tremendous advancement in the field of nonparametric statistics, a nonparametric
regression approach to incomplete data brings along all the modern tools in this field such as minimax
rates of convergence, Besov spaces, wavelets and adaptive estimators.

The problem of estimating an unknown response function in the context of wavelet thresholding
in the nonparametric regression setting with irregular design has now been addressed by
many authors, see, e.g., Hall and Turlach (1997), Antoniadis and Pham (1998), Cai and Brown
(1998), Sardy et al. (1999), Kovac and Silverman (2000), Pensky and Vidakovic (2001), Brown
et al. (2002), Zhang et al. (2002), Kohler (2003) and Amato et al. (2006). Several tools were
suggested for attacking the problem; here, we shall review only a few of them. For instance, the
procedure of Kovac and Silverman (2000) applies a linear interpolation transformation $R$ to the
observed data vector $y = (y_1, y_2, \ldots, y_n)$ that maps it to a new vector of size $2^J$
($2^{J-1} < n \le 2^J$), corresponding to a new design with equispaced points. After the transformation,
the new vector is multivariate normal with mean $Rf$ and a covariance matrix which is assumed to have
a finite bandwidth, so that the computational complexity of their algorithm is of order $n$. Cai and
Brown (1998) attacked the problem by using multiresolution analysis, projection and wavelet nonlinear
thresholding while Sardy et al. (1999) applied an isometric method. Pensky and Vidakovic (2001)
estimated the conditional expectation $E(Y|X)$ directly by constructing its wavelet expansion, while
Amato et al. (2006) applied a reproducing kernel Hilbert space (RKHS) approach in the spirit of
Wahba (1990). However, until very recently, all studies have been carried out under the assumption
that the nonequispaced design still possesses some regularity, namely, that the density function $g$ of
the design points $x_i$, $i = 1, 2, \ldots, n$, is uniformly bounded from below, i.e.,
$\inf_{x \in [0,1]} g(x) \ge c$ for some constant $c > 0$. In this case, asymptotically, model (1.1) is
equivalent to the standard (equispaced) nonparametric regression model, as long as the design density
function $g$ is known (see, e.g., Brown et al. (2002)).

Recently, more advanced investigations of the problem have been attempted. Kerkyacharian
and Picard (2004) introduced warped wavelets to construct estimators of the unknown
response function $f$ under model (1.1) when the design density function $g$ has zeros of polynomial
order. They, however, measured the error of their suggested estimator in the warped Besov spaces
which is, practically, equivalent to measuring the error of the estimator at the design points only.
For this reason, the derived estimators possess the usual asymptotical (as the sample size increases)
minimax rates of convergence which do not depend on the order of the zeros of the design density
function $g$. This line of investigation was continued by Chesneau (2007a, 2007b) who constructed
asymptotic minimax lower bounds over a wide range of Besov balls, under the assumption that the
design density function $g$ is known and that $1/g$ is integrable, and, furthermore, suggested adaptive
wavelet thresholding estimators for the unknown response function $f$. However, in both Kerkyacharian
and Picard (2004) and Chesneau (2007a, 2007b), the assumptions on the design density
function $g$ are restrictive enough so that the asymptotical minimax rates of convergence of any
estimator coincide with the asymptotical minimax rates of convergence under the assumption that
$g$ is bounded from below, i.e., the corresponding nonparametric estimation problem is a well-posed
problem.

Gaïffas (2005, 2006, 2007, 2009) seems to be the only author who considered the problem as
an ill-posed problem. He studied local minimax rates and constructed locally adaptive estimators on
the basis of local polynomials. Gaïffas (2005, 2007) constructed pointwise estimators of a regression
function when $1/g$ is not integrable and showed that convergence rates of the estimators are slower
than in the case when $g$ is bounded below, hence demonstrating that the problem of regression
estimation under irregular design is an ill-posed problem. The shortcoming of his work is that the
minimax rates of convergence are expressed in a very complex form which is very hard to evaluate
for a regression function $f$ which belongs to a standard functional class. Also, his techniques are
intended for local reconstruction and depend on cross-validation at each point, so that they become
too involved when applied to the whole domain of the function $f$.

The objective of the present paper is to study how zeros of the design density $g$ affect
reconstruction of the regression function $f$ globally. As we show below (see Remark 2), assessing minimax
global convergence rates is a much harder task than assessing minimax local convergence rates.
Model (1.1) can be viewed as a spatially inhomogeneous ill-posed problem which is inherently
more difficult than spatially homogeneous problems, e.g., deconvolution, especially in the case when
the true regression function is spatially homogeneous. To the best of our knowledge, so far, there
are no results for asymptotic rates of convergence in the case of a spatially inhomogeneous ill-posed
problem when its solution is spatially homogeneous, since authors usually avoid the problem by
restricting their attention to the case when the estimated function is spatially inhomogeneous, or,
at most, belongs to a Sobolev ball (see, e.g., Hoffmann and Reiss (2008)).

In what follows, we address these issues. In particular, we mainly consider two different
cases: when some fractional power of $1/g$ is integrable (zero of a polynomial order) and when no
fractional power of $1/g$ is integrable (zero of an exponential order). We obtain asymptotic minimax
lower bounds for the global risk of an estimator of $f$ and construct adaptive wavelet nonlinear
thresholding estimators of $f$ which attain those convergence rates (up to a logarithmic factor in the
case of a zero of a polynomial order), over a wide range of Besov balls. Due to spatial irregularity,
the estimators exhibit very different behavior and minimax convergence rates in comparison with
the solution of a spatially homogeneous ill-posed problem (see Remark 3). Specifically, even if $1/g$ is
not integrable, one can recover $f$ as well as in the case of an equispaced design (in terms of minimax
convergence rates) when the function is homogeneous enough, since the estimator is "borrowing
strength" in the areas where $f$ is adequately sampled. These features lead to a different structure
of estimators of $f$ described in Section 5. The complementary case when $1/g$ is integrable has been
partially handled by Chesneau (2007a), who showed that the problem is well-posed (i.e., data loss
does not affect convergence rates) when $f$ is spatially homogeneous; this case is treated in Section 7.
An in-depth discussion of the spatial features of the spatially inhomogeneous ill-posed
problem studied in this paper is presented in Section 8.

We limit our attention only to the case of an $L^2$-risk since the consideration of a wider class
of risk functions would make the exposition of our work even longer; all results obtained, however,
can be extended to the case of $L^u$-risks, $1 \le u < \infty$.

The rest of the paper is organized as follows. Section 2 discusses the formulation of the
nonparametric regression estimation problem in the cases of moderate and severe data losses. In
Section 3, we derive the asymptotical minimax lower bounds for the $L^2$-risk over a wide range of
Besov balls. Section 4 discusses estimation strategies when $1/g$ is not integrable, in particular,
the partitioning of the unknown response function $f$ and its estimator into the zero-affected and
zero-free parts. Section 5 elaborates on the estimation of the zero-affected and the zero-free parts,
and is followed by Section 6 which discusses the choice of adaptive resolution level and derives the
asymptotical minimax upper bounds for the $L^2$-risk in the case when $1/g$ is not integrable. Section
7 studies the complementary case when $g$ has zeros but $1/g$ is still integrable. Section 8 concludes
the paper with a discussion. Finally, Section 9 contains the proofs of the statements in the earlier
sections.

2 Formulation of the problem 

Consider the nonparametric regression model (1.1). Since the noise level is assumed to be known
and finite, without loss of generality, we set $\sigma = 1$. Therefore, from now onwards, we work with
observations $y_i$ governed by equation (1.1) where $f \in L^2([0,1])$ is the unknown response function
to be recovered, $x_i \in [0,1]$, $i = 1, 2, \ldots, n$, are random design points with the underlying density
function $g$, and $\xi_i$, $i = 1, 2, \ldots, n$, are independent standard Gaussian random variables, independent
of $x_i$, $i = 1, 2, \ldots, n$. Furthermore, we assume that the design density $g$ is known and has a
finite number of well-separated zeros on $[0,1]$. The last assumption is motivated by the following
considerations. If $g$ vanishes on an interval of non-asymptotic length, then consistent estimation of
$f$ is impossible. Also, the zeros of $g$ have a concentration point only in the case when $g$ is highly
oscillatory, which is not a very likely scenario. Finally, the assumption that $g$ has low values on
a part of its domain but is still separated from zero is not an interesting case to consider, since
the lower bound on $g$ will appear in the constant of the well-known expressions for the minimax
convergence rates (see, e.g., Tsybakov (2008)).

Note that the above assumptions are not restrictive. If the noise level $\sigma$ is unknown, it can
be easily estimated with parametric precision using observations in the region where $g$ is separated
from zero. The assumption that the design points $x_i$, $i = 1, 2, \ldots, n$, are random is not confining
either. In fact, with small modifications of the theory below, one can consider fixed points
$0 < x_1 < x_2 < \cdots < x_n \le 1$, generated by an increasing continuously differentiable function $G$ such
that $G(0) = 0$, $G(1) = 1$ and $G(x_i) = i/n$, $i = 1, 2, \ldots, n$. Then, the function $G$ plays the role of a
"surrogate" distribution function with density function $g$; the design points $x_i$, $i = 1, 2, \ldots, n$, can
then be obtained as $x_i = G^{-1}(i/n)$, $i = 1, 2, \ldots, n$.
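The fixed-design construction above can be sketched in a few lines. The sketch assumes, purely for illustration, the surrogate distribution function G(x) = x^(alpha+1) on [0, 1], whose density vanishes at 0 with polynomial order alpha; this particular G and the function names are hypothetical and not taken from the text.

```python
def surrogate_cdf(x, alpha):
    """Assumed surrogate distribution function G(x) = x**(alpha + 1)."""
    return x ** (alpha + 1.0)


def design_points(n, alpha):
    """Fixed design x_i = G^{-1}(i/n), i = 1, ..., n, for the assumed G."""
    return [(i / n) ** (1.0 / (alpha + 1.0)) for i in range(1, n + 1)]


points = design_points(8, alpha=2.0)
# Distances between consecutive design points, starting from 0.
gaps = [right - left for left, right in zip([0.0] + points[:-1], points)]
```

For this choice of G the gaps are widest next to the zero of the density, which is the region where the regression function is sampled most sparsely.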

Moreover, since the design density $g$ is known and has only a finite number of zeros which
are well separated, one can partition the interval $[0,1]$ into subintervals in such a manner that each
subinterval contains only one zero of $g$. For this reason, without loss of generality, we assume that
$g$ has only one zero $x_0 \in (0,1)$ and the following conditions hold.

Assumption A. Let the design density function $g$ be a continuous function on the interval $[0,1]$
with $g(x_0) = 0$, $x_0 \in (0,1)$. Then, there exist constants $\alpha \in \mathbb{R}$, $b \ge 0$
($\alpha > 0$ if $b = 0$), $\beta > 0$ and $C_g > 0$ such that, for $x, x + x_0 \in [0,1]$,



If $b = 0$, we shall say that $x_0$ is a zero of polynomial order. If $b > 0$, we shall say that $x_0$ is
a zero of exponential order. Observe that (2.1) implies that there exist some absolute constants
$C_{g1} < C_g < C_{g2}$ such that, for any $x$ with $x, x + x_0 \in [0,1]$ and $x_0 \in (0,1)$, one has

$$g(x_0 + x) \le C_{g2} |x|^{\alpha} \exp(-b|x|^{-\beta}), \quad g(x_0 + x) \ge C_{g1} |x|^{\alpha} \exp(-b|x|^{-\beta}). \tag{2.2}$$

Note that the two cases in Assumption A correspond to the situations of moderate ($b = 0$) and
severe ($b > 0$) data losses, respectively. Chesneau (2007a) showed that in the case of a moderate loss
with $0 < \alpha < 1$ (i.e., when $b = 0$ and $1/g$ is integrable) and a spatially homogeneous function, the
unknown response function $f$ can be estimated with the same asymptotical minimax convergence
rates under the $L^2$-risk as in the case of $\alpha = 0$; hence, in this case, the nonparametric regression
estimation problem turns out to be a well-posed problem.

We shall be, therefore, mainly interested in the complementary situation when $1/g$ is
not integrable: (i) moderate losses (i.e., $b = 0$) with $\alpha \ge 1$ and (ii) severe losses (i.e., $b > 0$)
with $\alpha \in \mathbb{R}$ and $\beta > 0$. As we shall see below, usually in those cases, the asymptotically optimal
estimation procedures yield estimators with lower convergence rates than in the case of equispaced
observations, so that the corresponding nonparametric regression estimation problem under model
(1.1) becomes ill-posed (see Remark 1), with the degree of ill-posedness growing as $\alpha \ge 1$ increases
when $b = 0$ or as $\beta > 0$ increases when $b > 0$.

In what follows, we use the symbol C for a generic positive constant, independent of the 
sample size n, which may take different values at different places. 

Remark 1 (Risk functions and design). As indicated above, we shall measure the precision of
any estimator $\hat f_n$ of $f$ by its $L^2$-risk, i.e.,

$$\Delta(\hat f_n) = E\|\hat f_n - f\|^2.$$

If the design points $x_i \in [0,1]$, $i = 1, 2, \ldots, n$, are treated as fixed (i.e., non-random), then the above
risk, evaluated at the equispaced design $\{i/n\}$, $i = 1, 2, \ldots, n$, corresponds to

$$\Delta_d(\hat f_n) = \frac{1}{n} \sum_{i=1}^{n} E[\hat f_n(i/n) - f(i/n)]^2$$

and leads to an ill-posed nonparametric regression estimation problem. However, it is instructive
to note that if one measures the precision of an estimator $\hat f_n$ at the design points $x_i \in [0,1]$,
$i = 1, 2, \ldots, n$, only, by calculating

$$\Delta_d^{\mathrm{fixed}}(\hat f_n, x_i) = \frac{1}{n} \sum_{i=1}^{n} E[\hat f_n(x_i) - f(x_i)]^2,$$

as was done in, e.g., Amato et al. (2006), then the problem ceases to be ill-posed. Moreover,
in this case, no special treatment is necessary to account for the irregular design. To confirm that,
note that equation (1.1) can be rewritten as

$$y_i = F(i/n) + \xi_i, \quad i = 1, 2, \ldots, n, \tag{2.3}$$

where $F(x) = f(G^{-1}(x))$, $x \in [0,1]$. Construct now an estimator $\hat F_n$ of $F$ using, e.g., any of the
standard wavelet thresholding techniques, and set $\hat f_n(x) = \hat F_n(G(x))$, $x \in [0,1]$. Then,

$$\hat F_n(x) = \hat f_n(G^{-1}(x)), \quad x \in [0,1],$$

and $\Delta_d^{\mathrm{fixed}}(\hat f_n, x_i)$ takes the form

$$\Delta_d^{\mathrm{fixed}}(\hat f_n, x_i) = \frac{1}{n} \sum_{i=1}^{n} E[\hat F_n(i/n) - F(i/n)]^2.$$

Therefore, if the observed data vector $y = (y_1, y_2, \ldots, y_n)$ is treated as if the measurements were
carried out at equispaced design points, then, by using, e.g., available wavelet denoising algorithms,
the resulting estimator $\hat F_n$ of the function $F$ will be adaptive and it will lead to the smallest possible
risk $\Delta_d^{\mathrm{fixed}}(\hat f_n, x_i)$. This phenomenon was noticed earlier by Cai and Brown (1998), Sardy et al.
(1999) and Brown et al. (2002).
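The change of variables behind Remark 1 can be checked numerically. The following is only a sketch under stated assumptions: G(x) = x^(alpha+1) is a hypothetical surrogate distribution function, and `F_hat` is a made-up stand-in for a fitted estimator of F, since any function on [0, 1] illustrates the identity.

```python
ALPHA = 2.0  # assumed order of the zero, for illustration only


def G(x):
    """Assumed surrogate distribution function."""
    return x ** (ALPHA + 1.0)


def G_inv(u):
    """Inverse of the assumed G."""
    return u ** (1.0 / (ALPHA + 1.0))


def F_hat(u):
    """Hypothetical stand-in for an estimator of F(u) = f(G^{-1}(u))."""
    return 1.0 + 0.5 * u


def f_hat(x):
    """Estimator of f obtained by undoing the warping: f_hat(x) = F_hat(G(x))."""
    return F_hat(G(x))


n = 10
design = [G_inv(i / n) for i in range(1, n + 1)]
# Largest difference between f_hat at the design points and F_hat at i/n.
max_gap = max(abs(f_hat(x) - F_hat(i / n)) for i, x in enumerate(design, start=1))
```

Up to rounding, `max_gap` is zero: evaluating the estimator of f at the design points is the same as evaluating the estimator of F on the equispaced grid, which is why the error measured at the design points does not feel the irregular design.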



Remark 2 (Local versus global convergence rates). The problem of estimating $f$ globally is
a much more difficult task than estimating $f$ locally, say at a given point $a$. Indeed, if $G$ is known,
then $F(G(a)) = f(a)$ and, hence, one can estimate $F$ at the point $G(a)$ instead of estimating $f$ at
the point $a$, where $F(x) = f(G^{-1}(x))$, $x \in [0,1]$, and $F$ is uniformly sampled (see (2.3)). Hence,
pointwise estimation can be reduced to a well-addressed pointwise regression estimation problem. If
$g(a) \ne 0$, then the problem is well-posed and has been extensively studied before. If, instead, $a = x_0$
is a zero of $g$, then one can deduce minimax pointwise convergence rates directly from the considerations
of Remark 1 and straightforward calculus. Let, for simplicity, $x_0 = 0$ and $g(x) = (\alpha + 1) x^{\alpha}$, so that
$G(x) = x^{\alpha+1}$ and $G^{-1}(x) = x^{1/(\alpha+1)}$, $x \in [0,1]$. Let $f$ satisfy a Hölder condition of order $s$ at $x_0$,
i.e., $|f(x) - f(x_0)| \le C|x - x_0|^s$. Then, since $x_0 = 0$, $F(x) = f(G^{-1}(x))$, $x \in [0,1]$, satisfies a Hölder
condition of order $s' = s/(\alpha + 1)$ at 0, i.e.,

$$|F(x) - F(0)| = |f(G^{-1}(x)) - f(G^{-1}(x_0))| \le C|G^{-1}(x) - G^{-1}(x_0)|^s = C|x - x_0|^{s/(\alpha+1)}.$$

Since $f(x_0) = F(0)$, one can set $\hat f(x_0) = \hat F(0)$ and obtain minimax pointwise convergence rates for
$f(x_0)$, on noting that

$$E|\hat f(x_0) - f(x_0)|^2 = E|\hat F(0) - F(0)|^2 = O\left(n^{-\frac{2s'}{2s'+1}}\right) = O\left(n^{-\frac{2s}{2s+\alpha+1}}\right),$$

which coincides with the minimax pointwise convergence rates obtained by Gaïffas (2005). The
whole argument here rests on the fact that $f(x_0) = F(G(x_0))$, $x_0 \in [0,1]$, so one can estimate $F$
instead of $f$ at the respective point. This, however, cannot be accomplished when a global estimation
procedure is required since, in such a case, a Taylor expansion is needed, which can be applied only
locally.
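The exponent arithmetic in Remark 2 can be verified exactly. The sketch below uses rational arithmetic to confirm that substituting s' = s/(alpha + 1) into 2s'/(2s' + 1) gives 2s/(2s + alpha + 1); the function names are illustrative.

```python
from fractions import Fraction


def pointwise_exponent_via_warping(s, alpha):
    """Exponent 2s'/(2s' + 1) with the warped smoothness s' = s / (alpha + 1)."""
    s_prime = Fraction(s) / (Fraction(alpha) + 1)
    return 2 * s_prime / (2 * s_prime + 1)


def pointwise_exponent_closed_form(s, alpha):
    """The same exponent written directly as 2s / (2s + alpha + 1)."""
    s, alpha = Fraction(s), Fraction(alpha)
    return 2 * s / (2 * s + alpha + 1)
```

For example, with s = 1 and alpha = 1 both functions return 1/2, compared with the exponent 2/3 that the same smoothness yields at a point where the design density is bounded away from zero.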



3 Minimax lower bounds for the $L^2$-risk over Besov balls

Before constructing an estimator of the unknown response function $f$ under model (1.1), we first
derive the asymptotical minimax lower bounds for the $L^2$-risk over a wide range of Besov balls.

Among the various characterizations of Besov spaces for $f \in L^p([0,1])$ in terms of wavelet
bases, we recall that for an $r$-regular multiresolution analysis (see, e.g., Meyer, 1992, Chapter 2, pp.
21-25), with $0 < s < r$, and for a Besov ball $B^s_{p,q}(A)$ defined as

$$B^s_{p,q}(A) = \left\{ f \in L^p([0,1]) : f \in B^s_{p,q},\ \|f\|_{B^s_{p,q}} \le A \right\},$$

one has, in terms of the scaling coefficients $a_{mk}$ and the wavelet coefficients $b_{jk}$ of $f$ (see (4.1)),

$$\left( \sum_{k=0}^{2^m-1} |a_{mk}|^p \right)^{1/p} + \left( \sum_{j=m}^{\infty} 2^{js'q} \left( \sum_{k=0}^{2^j-1} |b_{jk}|^p \right)^{q/p} \right)^{1/q} \le A, \tag{3.1}$$

with respective sum(s) replaced by maximum if $p = \infty$ and/or $q = \infty$, where $s' = s + 1/2 - 1/p$
(see, e.g., Johnstone et al. (2004)). We study below the $L^2$-risk over Besov balls $B^s_{p,q}(A)$ defined
as

$$R_n(B^s_{p,q}(A)) = \inf_{\hat f_n} \sup_{f \in B^s_{p,q}(A)} E\|\hat f_n - f\|^2,$$

where $\|h\|$ is the $L^2$-norm of a function $h$ defined on the unit interval, and the infimum is taken over
all possible square-integrable estimators (i.e., measurable functions) $\hat f_n$ of $f$ based on observations
$y_i$ from model (1.1).

The following statement provides the asymptotical minimax lower bounds for the $L^2$-risk.

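Theorem 1 Let $1 \le p, q \le \infty$ and $s > \max(1/p, 1/2)$. Then, under Assumption A (with $\alpha > 0$ if
$b = 0$, and $\alpha \in \mathbb{R}$ and $\beta > 0$ if $b > 0$), as $n \to \infty$,

$$R_n(B^s_{p,q}(A)) \ge \begin{cases}
C n^{-\frac{2s}{2s+1}} & \text{if } b = 0,\ \alpha s \le s', \\
C n^{-\frac{2s'}{2s'+\alpha}} & \text{if } b = 0,\ \alpha s > s', \\
C (\ln n)^{-\frac{2s'}{\beta}} & \text{if } b > 0.
\end{cases} \tag{3.2}$$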



Remark 3 (Global convergence rates). As we shall show below, the minimax global convergence
rates in Theorem 1 are attainable for $b > 0$ and are attainable up to a logarithmic factor for
$b = 0$. If $\alpha s = s'$, the minimax global convergence rates in the first and second parts of (3.2) coincide.
Theorem 1 implies that, whenever $\alpha s \le s'$, the problem is not ill-posed, in the sense that the minimax
global convergence rates are the same as in the case of an equispaced design. For $\alpha \ge 1$, this relation
can take place only if $2 \le p \le \infty$, i.e., when the function is spatially homogeneous. In particular,
$\alpha s \le s'$ holds true for any $\alpha$ such that $1 \le \alpha \le 1 + s^{-1}(1/2 - 1/p)$, i.e., when $f$ is very homogeneous
spatially ($p$ is large, in particular, when $p \ge 2/[1 - 2(\alpha - 1)s]$ provided $1 \le \alpha < 1 + 1/(2s)$), so that even
a relatively severe data loss does not lead to the reduction of minimax global convergence rates. If
$0 < \alpha < 1$, then the problem is always well-posed whenever $f$ is spatially homogeneous ($p \ge 2$) and
also when $f$ is spatially inhomogeneous ($1 \le p < 2$) and $0 < \alpha \le 1 - (1/p - 1/2)/s$. Therefore,
even if $f$ is spatially inhomogeneous, the problem is well-posed whenever data loss is very limited
($0 < \alpha \le 1 - (1/p - 1/2)/s$).
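The case distinction in Theorem 1 and Remark 3 can be summarized in a small helper. This is a sketch that assumes the three regimes read as: exponent 2s/(2s + 1) when b = 0 and alpha*s <= s', exponent 2s'/(2s' + alpha) when b = 0 and alpha*s > s', and a logarithmic rate with exponent 2s'/beta when b > 0, where s' = s + 1/2 - 1/p. The function name and the regime labels are illustrative.

```python
import math


def lower_bound_regime(s, p, alpha=None, b=0.0, beta=None):
    """Return (label, exponent) for the assumed reading of the lower bound.

    For b == 0 the bound is of order n**(-exponent); for b > 0 it is of
    order (ln n)**(-exponent). Pass p=math.inf for p = infinity.
    """
    s_prime = s + 0.5 - (0.0 if math.isinf(p) else 1.0 / p)
    if b > 0:
        return "severe loss", 2.0 * s_prime / beta
    if alpha * s <= s_prime:
        return "equispaced rate", 2.0 * s / (2.0 * s + 1.0)
    return "reduced rate", 2.0 * s_prime / (2.0 * s_prime + alpha)
```

With s = 2, p = infinity and alpha = 1.2 one has s' = 2.5 >= alpha*s = 2.4, so the exponent stays at 4/5 although 1/g is not integrable; with s = 1, p = 2 and alpha = 2 the exponent drops from 2/3 to 1/2.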



4 Estimation strategies when $1/g$ is not integrable

We consider a scaling function $\varphi^*$ and a mother wavelet $\psi^*$ that generate an orthonormal wavelet
basis in $L^2(\mathbb{R})$, such as those obtained from, e.g., an $r$-regular multiresolution analysis of
$L^2(\mathbb{R})$, for some $r > 0$. We shall also assume that $\varphi^*$ and $\psi^*$ are both compactly supported,
with integer bounds on their supports so that, for some $L_{\varphi^*}, U_{\varphi^*}, L_{\psi^*}, U_{\psi^*} \in \mathbb{Z}$,
with $L_{\varphi^*} < U_{\varphi^*}$, $L_{\psi^*} < U_{\psi^*}$,

$$\mathrm{supp}(\varphi^*) = [L_{\varphi^*}, U_{\varphi^*}], \quad \mathrm{supp}(\psi^*) = [L_{\psi^*}, U_{\psi^*}], \quad L_{\varphi^*} \le 0, \quad U_{\varphi^*} > 0, \quad U_{\varphi^*} - L_{\varphi^*} > 4.$$

(For instance, the Daubechies or Symmlet scaling functions $\varphi^*$ and mother wavelets $\psi^*$, with
filter number (number of vanishing moments) $N > 3$, satisfy (4.2) with $L_{\varphi^*} = 0$, $U_{\varphi^*} = 2N - 1$,
$L_{\psi^*} = 1 - N$ and $U_{\psi^*} = N$, see, e.g., Mallat (1999), Section 7.2.)

We then obtain a periodized version of the wavelet basis on the unit interval, i.e., for $j \ge 0$
and $k = 0, 1, \ldots, 2^j - 1$, as

$$\varphi_{jk}(x) = \sum_{i \in \mathbb{Z}} 2^{j/2} \varphi^*(2^j(x + i) - k), \quad \psi_{jk}(x) = \sum_{i \in \mathbb{Z}} 2^{j/2} \psi^*(2^j(x + i) - k), \quad x \in [0,1],$$

so that, for any $m \ge 0$, the set

$$\{\varphi_{mk}, \psi_{jk} : j \ge m,\ k = 0, 1, \ldots, 2^j - 1\},$$

where

$$\varphi_{mk}(x) = 2^{m/2} \varphi(2^m x - k), \quad \psi_{jk}(x) = 2^{j/2} \psi(2^j x - k), \quad x \in [0,1],$$

forms an orthonormal wavelet basis for $L^2([0,1])$ (see, e.g., Mallat (1999), Theorem 7.16). Hence,
for any $m \ge 0$, any $f \in L^2([0,1])$ can be expanded as

$$f(x) = \sum_{k=0}^{2^m-1} a_{mk} \varphi_{mk}(x) + \sum_{j=m}^{\infty} \sum_{k=0}^{2^j-1} b_{jk} \psi_{jk}(x), \quad x \in [0,1], \tag{4.1}$$

where

$$a_{mk} = \int_0^1 f(x) \varphi_{mk}(x)\,dx, \quad k = 0, 1, \ldots, 2^m - 1,$$

$$b_{jk} = \int_0^1 f(x) \psi_{jk}(x)\,dx, \quad j \ge m, \quad k = 0, 1, \ldots, 2^j - 1.$$

Denote by $L_\varphi$, $U_\varphi$, $L_\psi$ and $U_\psi$ the support bounds of the periodic scaling function $\varphi$ and
mother wavelet $\psi$. Note that the supports of $\varphi^*_{mk}$ and $\varphi_{mk}$ coincide if and only if
$2^m > U_{\varphi^*} - L_{\varphi^*}$, and, similarly, the supports of $\psi^*_{jk}$ and $\psi_{jk}$ coincide if and only if
$2^m > U_{\psi^*} - L_{\psi^*}$. Choose the lowest resolution level $m_1$ such that
$2^{m_1} > \max(U_{\varphi^*} - L_{\varphi^*}, U_{\psi^*} - L_{\psi^*})$, so that the supports of periodic and
non-periodic wavelets coincide. In this case, we obtain that

$$L_{\varphi^*} = L_\varphi, \quad U_{\varphi^*} = U_\varphi, \quad L_{\psi^*} = L_\psi, \quad U_{\psi^*} = U_\psi, \quad L_\varphi \le 0, \quad U_\varphi > 0, \quad U_\varphi - L_\varphi > 4. \tag{4.2}$$

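For any integer $l \ge 1$, denote $k_{0l} = 2^l x_0$. At each resolution level, we partition the set of all
indices into the indices which are zero-affected and zero-free. In particular, let $K^{\varphi}_{0m}$ and $K^{\psi}_{0j}$ be
the sets such that, for any integer $m \ge m_1$ and $j = m, m + 1, \ldots$,

$$K^{\varphi}_{0m} = \{k : 0 \le k \le 2^m - 1,\ L_\varphi - 1 < k_{0m} - k < U_\varphi + 1\},$$
$$K^{\psi}_{0j} = \{k : 0 \le k \le 2^j - 1,\ L_\psi - 1 < k_{0j} - k < U_\psi + 1\},$$

and let

$$K^{\varphi}_{0mc} = \{k : 0 \le k \le 2^m - 1,\ k \notin K^{\varphi}_{0m}\}, \quad K^{\psi}_{0jc} = \{k : 0 \le k \le 2^j - 1,\ k \notin K^{\psi}_{0j}\}.$$

Simple calculations yield that $k \in K^{\varphi}_{0mc}$ and $k \in K^{\psi}_{0jc}$ imply that $x_0 \notin \mathrm{supp}\,\varphi_{mk}$ and
$x_0 \notin \mathrm{supp}\,\psi_{jk}$, respectively, so that the sets $K^{\varphi}_{0mc}$ and $K^{\psi}_{0jc}$ are zero-free while the sets
$K^{\varphi}_{0m}$ and $K^{\psi}_{0j}$ are zero-affected.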
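The index partition can be made concrete with a short sketch. It assumes the zero-affected set at a given level consists of the indices k with lower - 1 < 2^level * x0 - k < upper + 1, where [lower, upper] is the support of the generating function; the support [0, 5] and the zero x0 = 0.3 below are illustrative values, not taken from the text.

```python
def zero_affected_indices(level, x0, lower, upper):
    """Indices k in {0, ..., 2**level - 1} with lower - 1 < 2**level * x0 - k < upper + 1."""
    k0 = (2 ** level) * x0
    return [k for k in range(2 ** level) if lower - 1 < k0 - k < upper + 1]


def zero_free_indices(level, x0, lower, upper):
    """Complement of the zero-affected set within {0, ..., 2**level - 1}."""
    affected = set(zero_affected_indices(level, x0, lower, upper))
    return [k for k in range(2 ** level) if k not in affected]


# Illustrative values: support [0, 5] and a zero of the design density at x0 = 0.3.
affected = zero_affected_indices(level=5, x0=0.3, lower=0, upper=5)
free = zero_free_indices(level=5, x0=0.3, lower=0, upper=5)
```

Here `affected` contains the seven indices 4 to 10 out of 32, and for every index in `free` the point 2^5 * x0 = 9.6 lies outside [k + lower, k + upper], i.e., the zero is outside the support of the corresponding basis function.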

With the above notation it is easy to see that, for any $m \ge m_1$, $f$ can be partitioned as the
sum of zero-affected and zero-free parts, i.e.,

$$f(x) = f_{0,m}(x) + f_{c,m}(x), \quad x \in [0,1],$$

where

$$f_{0,m}(x) = \sum_{k \in K^{\varphi}_{0m}} a_{mk} \varphi_{mk}(x) + \sum_{j=m}^{\infty} \sum_{k \in K^{\psi}_{0j}} b_{jk} \psi_{jk}(x), \quad x \in [0,1], \tag{4.3}$$

$$f_{c,m}(x) = \sum_{k \in K^{\varphi}_{0mc}} a_{mk} \varphi_{mk}(x) + \sum_{j=m}^{\infty} \sum_{k \in K^{\psi}_{0jc}} b_{jk} \psi_{jk}(x), \quad x \in [0,1]. \tag{4.4}$$

We then construct estimators $\hat f_{0,m}$ and $\hat f_{c,m}$ of $f_{0,m}$ and $f_{c,m}$, respectively, and estimate $f$ by

$$\hat f_m(x) = \hat f_{0,m}(x) + \hat f_{c,m}(x), \quad x \in [0,1]. \tag{4.5}$$

(We emphasize the unusual feature in the construction of $\hat f_m$: as we shall see below, $\hat f_{0,m}$ is a linear
estimator while $\hat f_{c,m}$ is a non-linear estimator with the lowest resolution level $m$ determined by the
linear part $\hat f_{0,m}$.)

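By observing that for any function $u(x)$ we have

$$E\left[\frac{y_i\, u(x_i)}{g(x_i)}\right] = \int_0^1 f(x) u(x)\,dx,$$

and setting $u(x) = \varphi_{mk}(x)$ and $u(x) = \psi_{jk}(x)$, $x \in [0,1]$, in turn, similarly to Chesneau (2007a), we
estimate $a_{mk}$, $k \in K^{\varphi}_{0mc}$, and $b_{jk}$, $k \in K^{\psi}_{0jc}$, respectively, by

$$\hat a_{mk} = \frac{1}{n} \sum_{i=1}^{n} \frac{\varphi_{mk}(x_i)\, y_i}{g(x_i)}, \quad k \in K^{\varphi}_{0mc}, \qquad \hat b_{jk} = \frac{1}{n} \sum_{i=1}^{n} \frac{\psi_{jk}(x_i)\, y_i}{g(x_i)}, \quad k \in K^{\psi}_{0jc}. \tag{4.6}$$

Note that since $1/g$ is not integrable, the estimators (4.6) would have infinite variances if $k \in K^{\varphi}_{0m}$
or $k \in K^{\psi}_{0j}$, so that one cannot construct an estimator of $f_{0,m}$ by direct estimation of wavelet
coefficients. In this case, we shall use a linear estimator with the lowest resolution level $m$ estimated
from the data. In what follows, we shall consider the estimation of $f_{0,m}$ and $f_{c,m}$ separately.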
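The inverse-density weighting behind the zero-free coefficient estimators can be checked by simulation. The sketch below is illustrative only: it assumes a design density g(x) = 2x (zero at the boundary point 0 rather than at an interior point), a response f(x) = x, unit noise level, and a Haar wavelet in place of the smoother compactly supported wavelets required in the text. All names are hypothetical.

```python
import math
import random


def haar_wavelet(j, k, x):
    """Haar wavelet psi_{jk}(x) = 2**(j/2) * psi(2**j * x - k)."""
    t = (2 ** j) * x - k
    if 0.0 <= t < 0.5:
        return 2.0 ** (j / 2.0)
    if 0.5 <= t < 1.0:
        return -(2.0 ** (j / 2.0))
    return 0.0


def coefficient_estimate(xs, ys, g, u):
    """Empirical mean of y_i * u(x_i) / g(x_i)."""
    return sum(y * u(x) / g(x) for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)
f = lambda x: x              # assumed response function
g = lambda x: 2.0 * x        # assumed design density, vanishing at 0
n = 100_000
# Inverse-CDF sampling from g: G(x) = x**2, so x = sqrt(u) with u in (0, 1].
xs = [math.sqrt(1.0 - rng.random()) for _ in range(n)]
ys = [f(x) + rng.gauss(0.0, 1.0) for x in xs]
u = lambda x: haar_wavelet(1, 1, x)   # supported on [0.5, 1), away from the zero
b_hat = coefficient_estimate(xs, ys, g, u)
b_true = -math.sqrt(2.0) / 16.0       # integral of f(x) * u(x) over [0, 1]
```

Because the chosen wavelet is supported away from the zero of g, the weights 1/g(x_i) stay bounded and `b_hat` is close to `b_true`. For a wavelet whose support contains the zero, the same average would have infinite variance once 1/g is not integrable, which is why the zero-affected coefficients need a different treatment.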



5 Estimation of the zero-free and the zero-affected parts

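In order to estimate $f_{c,m}$, we construct a wavelet nonlinear thresholding estimator $\hat f_{c,m}$ as

$$\hat f_{c,m}(x) = \sum_{k \in K^{\varphi}_{0mc}} \hat a_{mk} \varphi_{mk}(x) + \sum_{j=m}^{J-1} \sum_{k \in K^{\psi}_{0jc}} \tilde b_{jk} \psi_{jk}(x), \quad m_1 \le m \le J - 1, \quad x \in [0,1], \tag{5.1}$$

where $\hat a_{mk}$ are given in (4.6), $J$ is defined below in (5.3), while the coefficients $\tilde b_{jk}$ are thresholded
estimators of the wavelet coefficients $b_{jk}$ defined as

$$\tilde b_{jk} = \begin{cases}
\hat b_{jk}\, I\left(|\hat b_{jk}|^2 > d^2 n^{-1} \ln n\; 2^{j\alpha} |k - k_{0j}|^{-\alpha}\right) & \text{if } b = 0, \\
\hat b_{jk}\, I\left(|k - k_{0j}| > 2^{j-m}\right) & \text{if } b > 0.
\end{cases} \tag{5.2}$$

Here, $d > 0$ is a constant, $\hat b_{jk}$ are defined by (4.6) and $m$ is such that $m_1 \le m \le J - 1$, where

$$2^{m_1} = \max(U_\varphi - L_\varphi, U_\psi - L_\psi) + 1, \qquad 2^J = \begin{cases}
(n/\ln n)^{1/(\alpha+1)} & \text{if } b = 0, \\
\ldots & \text{if } b > 0.
\end{cases} \tag{5.3}$$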
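The thresholding rule can be sketched as a single function. The sketch assumes the following reading of the rule: for b = 0 an empirical coefficient is kept when its square exceeds d^2 * (ln n / n) * 2^(j*alpha) * |k - k_0j|^(-alpha), and for b > 0 it is kept when |k - k_0j| > 2^(j-m). The function name and the numerical values in the usage note are hypothetical.

```python
import math


def threshold_coefficient(b_hat, j, k, k0j, n, d, alpha, b, m):
    """Apply the assumed thresholding rule to one empirical wavelet coefficient.

    The index k is expected to be zero-free, so that |k - k0j| > 0.
    """
    dist = abs(k - k0j)
    if b == 0:
        # Polynomial zero: the threshold grows as k approaches the zero of g.
        bound = d ** 2 * math.log(n) / n * 2.0 ** (j * alpha) * dist ** (-alpha)
        return b_hat if b_hat ** 2 > bound else 0.0
    # Exponential zero: keep the coefficient only far enough from the zero of g.
    return b_hat if dist > 2 ** (j - m) else 0.0
```

For instance, with j = 4, k = 12, k_0j = 4.8, n = 1000, d = 1 and alpha = 2, the bound is about 0.034, so a coefficient of 0.5 is kept while a coefficient of 0.1 is set to zero.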

Now, consider estimation of the zero-affected part. Since the estimators $\hat a_{mk}$ of $a_{mk}$, given in
(4.6), have infinite variances when $k \in K^{\varphi}_{0m}$, we estimate those coefficients by solving a system of
linear equations. Note that there is a finite known number of indices in $K^{\varphi}_{0m}$, determined by
$U_\varphi - L_\varphi$ and not growing with $m$. For a given $m \in [m_1, J - 1]$, denote

$$f_m(x) = \sum_{k=0}^{2^m-1} a_{mk} \varphi_{mk}(x), \quad \varepsilon_m(x) = \sum_{j=m}^{\infty} \sum_{k=0}^{2^j-1} b_{jk} \psi_{jk}(x), \quad x \in [0,1], \tag{5.4}$$

and observe that $f(x) = f_m(x) + \varepsilon_m(x)$, so that

$$\sum_{k \in K^{\varphi}_{0m}} a_{mk} \varphi_{mk}(x) = f(x) - \varepsilon_m(x) - \sum_{k \in K^{\varphi}_{0mc}} a_{mk} \varphi_{mk}(x), \quad x \in [0,1]. \tag{5.5}$$



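Denote $\Omega_\delta = [L_\varphi + \delta_b, U_\varphi - \delta_b]$, and choose $\delta_b$ such that

$$\delta_b = \begin{cases}
0 < \delta_b < 1/2,\ \varphi(L_\varphi + \delta_b) \ne 0,\ \varphi(U_\varphi - \delta_b) \ne 0, & \text{if } b > 0, \\
0, & \text{if } b = 0.
\end{cases} \tag{5.6}$$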
Introduce also a finite set of indices

$$K^{*}_{0m} = \{k : 0 \le k \le 2^m - 1,\ 2L_\varphi - U_\varphi < k_{0m} - k < L_\varphi \ \text{or}\ U_\varphi < k_{0m} - k < 2U_\varphi - L_\varphi\}. \tag{5.7}$$

Now, multiply both sides of formula (5.5) by $g(x)\, \varphi_{ml}(x)\, I(2^m x - l \in \Omega_\delta)$, $l \in K^{\varphi}_{0m}$, where
$I(x \in \Omega)$ is the indicator of the set $\Omega$, and integrate. As a result, obtain the following system of
linear equations

$$A^{(m)} u^{(m)} = c^{(m)} - \varepsilon^{(m)} - B^{(m)} v^{(m)}. \tag{5.8}$$
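Here, matrices $A^{(m)}$ and $B^{(m)}$ and vectors $c^{(m)}$, $\varepsilon^{(m)}$, $u^{(m)}$ and $v^{(m)}$ have, respectively, elements

$$A^{(m)}_{lk} = \int_0^1 \varphi_{mk}(x) \varphi_{ml}(x) g(x)\, I(2^m x - l \in \Omega_\delta)\,dx, \quad k, l \in K^{\varphi}_{0m}, \tag{5.9}$$

$$B^{(m)}_{lk} = \int_0^1 \varphi_{mk}(x) \varphi_{ml}(x) g(x)\, I(2^m x - l \in \Omega_\delta)\,dx, \quad l \in K^{\varphi}_{0m},\ k \in K^{*}_{0m}, \tag{5.10}$$

$$c^{(m)}_{l} = \int_0^1 f(x) \varphi_{ml}(x) g(x)\, I(2^m x - l \in \Omega_\delta)\,dx, \quad l \in K^{\varphi}_{0m}, \tag{5.11}$$

$$\varepsilon^{(m)}_{l} = \int_0^1 \varepsilon_m(x) \varphi_{ml}(x) g(x)\, I(2^m x - l \in \Omega_\delta)\,dx, \quad l \in K^{\varphi}_{0m}, \tag{5.12}$$

$$u^{(m)}_{k} = a_{mk}, \quad k \in K^{\varphi}_{0m}, \qquad v^{(m)}_{k} = a_{mk}, \quad k \in K^{*}_{0m}. \tag{5.13}$$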

Note that the matrices $A^{(m)}$ and $B^{(m)}$ are completely known, and also observe that $B^{(m)}_{lk} \ne 0$ only
if $k \in K^{*}_{0m}$, since, for $k \notin K^{*}_{0m}$, one has $\varphi_{mk}(x) \varphi_{ml}(x) = 0$.

Since $K^{*}_{0m} \subset K^{\varphi}_{0mc}$, it follows from (5.13) that the components of the vector $v^{(m)}$ can be
estimated by

$$\hat v^{(m)}_{k} = \hat a_{mk}, \quad k \in K^{*}_{0m},$$

using (4.6). We also estimate $c^{(m)}_l$ by

$$\hat c^{(m)}_{l} = \frac{1}{n} \sum_{i=1}^{n} y_i\, \varphi_{ml}(x_i)\, I(2^m x_i - l \in \Omega_\delta), \quad l \in K^{\varphi}_{0m}, \tag{5.14}$$

and ignore the vector $\varepsilon^{(m)}$ in (5.8), thus replacing (5.8) by the following system of linear equations

$$A^{(m)} \hat u^{(m)} = \hat c^{(m)} - B^{(m)} \hat v^{(m)}. \tag{5.15}$$

Since the matrix $A^{(m)}$ is a positive definite matrix of non-asymptotic size, $\det(A^{(m)}) \ne 0$ and we obtain
the solution

$$\hat u^{(m)} = (A^{(m)})^{-1} \left( \hat c^{(m)} - B^{(m)} \hat v^{(m)} \right)$$

of the system of linear equations (5.15) and set $\hat a_{mk} = \hat u^{(m)}_k$, $k \in K^{\varphi}_{0m}$. Finally, for a given $m$, with
$\hat a_{mk}$, $k \in K^{\varphi}_{0m}$, so defined, we estimate $f_{0,m}$ by the following wavelet linear estimator

$$\hat f_{0,m}(x) = \sum_{k \in K^{\varphi}_{0m}} \hat a_{mk} \varphi_{mk}(x), \quad x \in [0,1]. \tag{5.16}$$
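The last step, solving a small linear system for the zero-affected scaling coefficients, can be sketched with plain Gaussian elimination. The sketch assumes the system has the form A u = c_hat - B v_hat with a small, nonsingular A; the matrices and vectors in the example are made-up numbers without statistical meaning, and the helper names are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    size = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(size):
        # Move the row with the largest pivot candidate into position.
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, size):
            factor = aug[row][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[row][c] -= factor * aug[col][c]
    solution = [0.0] * size
    for row in reversed(range(size)):
        tail = sum(aug[row][c] * solution[c] for c in range(row + 1, size))
        solution[row] = (aug[row][size] - tail) / aug[row][row]
    return solution


def zero_affected_coefficients(A, B, c_hat, v_hat):
    """Return u_hat solving A u = c_hat - B v_hat."""
    rhs = [c - sum(b * v for b, v in zip(row, v_hat)) for c, row in zip(c_hat, B)]
    return solve_linear_system(A, rhs)


# Made-up 2 x 2 example.
A = [[2.0, 0.5], [0.5, 1.0]]
B = [[0.1, 0.0], [0.0, 0.2]]
u_hat = zero_affected_coefficients(A, B, c_hat=[1.0, 0.7], v_hat=[1.0, 1.0])
```

Since the number of zero-affected indices does not grow with the sample size, the cost of this step is negligible compared with computing the empirical coefficients.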

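The following statement provides the asymptotic upper bounds for the bias and the variance
of the estimator $\hat f_{0,m}$ given in (5.16).

Lemma 1 Denote $f^{*}_{0,m}(x) = \sum_{k \in K^{\varphi}_{0m}} a_{mk} \varphi_{mk}(x)$ and let $m = m(n)$ be a non-random quantity,
$m(n) \to \infty$ as $n \to \infty$. Then, $\hat f_{0,m}(x)$ defined in (5.16) satisfies, as $n \to \infty$,

$$\|E \hat f_{0,m} - f^{*}_{0,m}\|^2 = O\left(2^{-2ms'}\right), \quad E\|\hat f_{0,m} - E \hat f_{0,m}\|^2 = O\left(n^{-1} 2^{m\alpha} \exp\left(b\, 2^{m\beta} [2^{\beta+1} + 1]\right)\right). \tag{5.17}$$

Moreover, if $b = 0$, then $E\|\hat f_{0,m} - E \hat f_{0,m}\|^4 = o(1)$.

Define $m_0$ to be such that

$$2^{m_0} = \begin{cases}
n^{\frac{1}{2s'+\alpha}} & \text{if } b = 0, \\
\left(b^{-1} 2^{-(\beta+2)} \ln n\right)^{1/\beta} & \text{if } b > 0.
\end{cases} \tag{5.18}$$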
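The choice of the level m_0 can be checked against the bias and variance orders. The sketch assumes the following readings: squared bias of order 2^(-2ms'), variance of order n^(-1) 2^(m*alpha) exp(b 2^(m*beta) [2^(beta+1) + 1]), and 2^(m_0) equal to n^(1/(2s'+alpha)) for b = 0 and to (ln n / (b 2^(beta+2)))^(1/beta) for b > 0. The function names are illustrative and m_0 is left real-valued.

```python
import math


def oracle_level(n, s_prime, alpha, b=0.0, beta=None):
    """Real-valued m_0 under the assumed formula; round it in practice."""
    if b == 0:
        two_pow_m0 = n ** (1.0 / (2.0 * s_prime + alpha))
    else:
        two_pow_m0 = (math.log(n) / (b * 2.0 ** (beta + 2.0))) ** (1.0 / beta)
    return math.log2(two_pow_m0)


def squared_bias_order(m, s_prime):
    """Order of the squared bias at level m."""
    return 2.0 ** (-2.0 * m * s_prime)


def variance_order(n, m, alpha, b=0.0, beta=1.0):
    """Order of the variance at level m."""
    growth = 1.0
    if b > 0:
        growth = math.exp(b * 2.0 ** (m * beta) * (2.0 ** (beta + 1.0) + 1.0))
    return 2.0 ** (m * alpha) * growth / n
```

For b = 0 the two orders balance exactly at m_0: with n = 10^6, s' = 1 and alpha = 2 both equal 10^(-3). For b > 0 the exponential factor at m_0 is a power of n smaller than n, so the variance is polynomially small and the logarithmic squared bias dominates.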



It follows from Lemma 1 that, if $m = m_0$, the error $E\|\hat f_{0,m} - f_{0,m}\|^2$ of the estimator $\hat f_{0,m}$ attains the
lower bounds in Theorem 1. Since $\alpha$, $b$ and $\beta$ in (5.18) are known, in the case of $b > 0$ the value
of $m_0$ is known. Therefore, one can select $m_0$ as the lowest resolution level in the estimator (5.1) of
the zero-free part. The following lemma demonstrates that the estimator $\hat f_{c,m}$, given in (5.1),
indeed attains the minimax global convergence rates in this case.

Lemma 2 Let $1 \le p, q \le \infty$ and $s > \max(1/2, 1/p)$. Then, under Assumption A, with $b > 0$ and
$\alpha \in \mathbb{R}$, for the estimator (5.1) with $m = m_0$, as $n \to \infty$,

$$\sup_{f \in B^s_{p,q}(A)} E\|\hat f_{c,m_0} - f_{c,m_0}\|^2 \le C (\ln n)^{-\frac{2s'}{\beta}}. \tag{5.19}$$

Unfortunately, this idea cannot be implemented in the case of $b = 0$. Indeed, though $\alpha$ in
(5.18) is known, the value of $s'$ is unknown and, therefore, the estimator $\hat f_{0,m}$ with $m = m_0$ is not
realizable if $b = 0$. In this case, we need to choose a resolution level $\hat m$ which approximates $m_0$ in
some sense and then estimate $f$ by $\hat f(x) = \hat f_{0,\hat m}(x) + \hat f_{c,\hat m}(x)$. The choice of such a resolution level is
a rather difficult task. On the one hand, $\hat m$ should not be too small, otherwise the linear portion
of the estimator would have a bias which is too large. On the other hand, since $\hat f_{0,\hat m}$ is a linear
estimator, in order to represent $f = f_{0,m} + f_{c,m}$ adequately, $\hat m$ has to be used as the lowest resolution
level in $\hat f_{c,\hat m}(x)$. The following lemma provides upper bounds for the risk of the estimator (5.1) of
the zero-free part $f_{c,m}$ when $b = 0$ and shows that the risk contains the component $n^{-1} 2^{m\alpha}$, so
that in order to attain the minimax risk (3.2), one needs $\hat m \le m_0$ with high probability.

Lemma 3 Let $1 \le p, q \le \infty$ and $s > \max(1/2, 1/p)$, and let Assumption A hold with $\alpha \ge 1$. Let $\hat f_{c,m}$ be given by (5.1) where the non-random quantity $m = m(n)$ is such that $m_1 \le m \le J - 1$ with $m_1$ and $J$ defined in (5.3). Let $\hat b_{jk}$ be given by (5.2), and $d > 4C_d$, where $C_d$ is given by

$$C_d = 8 C_\psi (C_{g1})^{-1} \max\left(2,\ 2\|f\|_\infty^2,\ \|f\|_\infty \|\psi\|_\infty / 3,\ \|\psi\|_\infty\right) \quad \text{with} \quad C_\psi = [2\max(|L_\psi|, |U_\psi|)]^{\alpha}. \qquad (5.20)$$

Then, for the estimator $\hat f_{c,m}$ defined in (5.1), as $n \to \infty$,

O (V 1 2 ma (lnn) 1 ^ 1 ) + n~^i (Inn)' 41 ) if b = 0, as < s', 

O ^n" 1 2 ma (lnn) 1 ^ 1 ) + (lnn)^ if b = 0, as > s' . 

(5.21) 



sup E||/ C)m - /, 



2 



c,m | 



/GB|,,(A) 

Here, 



2s(l+% = l)) 2s'(l+I(a = l)) 

2s + 1 2s' + a \ s 



Moreover, as $n \to \infty$,

$$\mathbb{E}\|\hat f_{c,m} - f_{c,m}\|^4 = o(1). \qquad (5.22)$$

6 Adaptive estimators and the minimax upper bounds for the $L^2$-risk when $1/g$ is not integrable

In order to construct an adaptive wavelet estimator of $f$ in the case of $b = 0$, we shall use the technique of optimal tuning parameter selection pioneered by Lepski (1990, 1991) and further exploited in Lepski and Spokoiny (1997) and Lepski et al. (1997). The idea behind this technique is to construct estimators for various values of the tuning parameter in question ($m$, in our case), and then choose an optimal value of the tuning parameter by regulating the differences between the estimators constructed with different values of the parameter.

In particular, if $b = 0$, for various values of $m$, we construct versions of the system of equations (5.15) where the estimators $\hat v^{(m)}$ are constructed as before, solve those systems and obtain estimators (5.16) with $\hat a_{mk} = \hat u^{(m)}_k$, $k \in K^{\varphi}_{0m}$. Construct an estimator $\hat f_m$ of $f$ using formula (4.5) where $\hat f_{0,m}$ and $\hat f_{c,m}$ are of the forms (5.16) and (5.1), respectively, and $m$ is the lowest resolution level of $\hat f_{c,m}$. The choice of the optimal resolution level is driven by the zero-affected portion of $f$ rather than the zero-free portion. For this reason, for any resolution level $m > 0$, we define a neighborhood $\Xi_m$ of $x_0$ as

E m = {x: 2~ m [m.m(L v ,,L i ,)-U ip ]<x-x <2~ m [max{U v ,U i ,)-L^} (6.1) 

and observe that $\Xi_m$ is designed so that $\mathrm{supp}(f_{0,m}) \subset \Xi_m$, $\mathrm{supp}(\hat f_{0,m}) \subset \Xi_m$ and $\Xi_j \subset \Xi_m$ if $j > m$. For $b = 0$, choose $m = \hat m$ such that $m_1 \le \hat m \le J - 1$, where $m_1$ and $J$ are defined in (5.3) and

$$\hat m = \min\left\{ m : \|(\hat f_m - \hat f_j) I(\Xi_m)\|^2 \le \lambda^2 2^{j\alpha} n^{-1} \ln n \ \text{ for all } j,\ m < j \le J - 1 \right\}, \qquad (6.2)$$

where $\lambda > 0$ is a constant to be defined below. For completeness, define

$$\hat m = m_0 \quad \text{if } b > 0, \qquad (6.3)$$

where $m_0$ is defined in (5.18).

The construction of $\hat m$ for $b = 0$ is based on the following idea. Note that when $\hat m \le m_0$, then for $m = \hat m$, one has

$$\mathbb{E}\|\hat f_m - f\|^2 \le 2\left[ \mathbb{E}\|\hat f_m - \hat f_{m_0}\|^2 + \mathbb{E}\|\hat f_{m_0} - f\|^2 \right]. \qquad (6.4)$$

The first component in (6.4) is small due to the definition of the resolution level $\hat m$, while the second component is calculated at the optimal resolution level $m_0$ and, hence, tends to zero at the optimal convergence rate (up to a logarithmic factor). On the other hand, if $m = \hat m > m_0$, then there exists $j > m$ such that $\|(\hat f_m - \hat f_j) I(\Xi_m)\|^2 > \lambda^2 2^{j\alpha} n^{-1} \ln n$. The following lemma shows that, if $\lambda$ is large enough, the probability of this event is infinitesimally small.
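The selection rule (6.2) can be sketched in a few lines of code. The sketch assumes that (6.2) picks the smallest level $m$ for which the squared norm of $(\hat f_m - \hat f_j)$ restricted to $\Xi_m$ does not exceed $\lambda^2 2^{j\alpha} n^{-1}\ln n$ for every finer level $j$ with $m < j \le J - 1$. The dictionary `diff_sq` of precomputed squared norms and the function name `lepski_level` are hypothetical; computing the norms themselves requires the estimators $\hat f_m$ and is not shown.

```python
import math


def lepski_level(diff_sq, m1, J, n, alpha, lam):
    """Smallest m in [m1, J-1] passing the comparison rule assumed for (6.2).

    diff_sq[(m, j)] is assumed to hold the squared L2-norm of
    (f_m - f_j) restricted to the neighbourhood Xi_m, for m < j <= J - 1.
    """
    for m in range(m1, J):
        passes = all(
            diff_sq[(m, j)] <= lam ** 2 * 2.0 ** (j * alpha) * math.log(n) / n
            for j in range(m + 1, J)
        )
        if passes:
            # For m = J - 1 there is nothing to compare, so the loop always stops.
            return m
    raise ValueError("empty range of resolution levels: need m1 <= J - 1")


if __name__ == "__main__":
    # Toy squared differences: level 2 disagrees strongly with level 3.
    toy = {(2, 3): 1.0, (2, 4): 0.0, (2, 5): 0.0,
           (3, 4): 0.05, (3, 5): 0.1, (4, 5): 0.0}
    print(lepski_level(toy, m1=2, J=6, n=1000, alpha=1.0, lam=1.0))
```

The threshold grows with $j$ at the same rate $2^{j\alpha} n^{-1}$ as the variance component of the risk, so a level is rejected only when its difference from a finer level is larger than what noise alone would explain.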
Lemma 4 Let $b = 0$ and let $m_0$ and $\hat m$ be given by the expressions (5.18) and (6.2), respectively. Denote



C X0 = 4^2(U lp -L lp + i), Cxi = C A0 (V2C 92 )- 1 ||(A*)- 1 ||, Cx2 = Cxo ||(A*) _1 B !t 
Let Cd be given by i5.20\) and 



where 



C u = max(C\iC K ,C\ 2 C T ) , 
C T = %C v {C gl y l max (2, 2 



C K 



minmax 16C v Co 2 ||/||oo, 16a, 

a>0 



C\ = max (2C U , C T Cxo) , 



Vlloo/3, |M|oo) , 
oo|M|oo , an n 

— , 100^032, 



00' \\J Woo 



^C lf Cg2\\f\\oo M\f 



3r/ 



(6.5) 
(6-6) 



(6.7) 
(6.8) 

(6.9) 



Here $\|h\|_\infty$ is the uniform norm of a bounded function $h$ defined on the unit interval, $C_{g1}$ and $C_{g2}$ are defined by (2.2) and $C_\varphi = [2\max(|L_\varphi|, |U_\varphi|)]^{\alpha}$. If $\lambda > \max(C_{\lambda 1}, C_{\lambda 2})$, then, as $n \to \infty$,



(rh > m ) = O ( n °a ) + O ( n a+1 



(6.10) 



Lemma 4 confirms that indeed $m = \hat m$ can be chosen as the lowest resolution level in the nonlinear portion of the estimator, so that we estimate $f$ by

$$\hat f(x) = \hat f_{0,\hat m}(x) + \hat f_{c,\hat m}(x), \quad x \in [0, 1], \qquad (6.11)$$

where $\hat f_{0,\hat m}(x)$ and $\hat f_{c,\hat m}(x)$ are defined in (5.16) and (5.1), respectively. The following statement confirms that the wavelet nonlinear estimator $\hat f$ given by (6.11) indeed attains (up to a logarithmic factor) the asymptotic minimax lower bounds obtained in Theorem 1.



Theorem 2 Let $1 \le p, q \le \infty$ and $s > \max(1/2, 1/p)$ and let Assumption A hold with $\alpha \ge 1$ if $b = 0$ and $\alpha \in \mathbb{R}$ if $b > 0$. Let $\hat f$ be the wavelet estimator defined by (6.11) with $\lambda > \max(2C_\lambda, C_{\lambda 1}, C_{\lambda 2})$ in (6.2) and $d > 2(\alpha + 1)^{-1}(2\alpha + 3)C_d$, where $C_\lambda$ is defined in (6.6), $C_{\lambda 1}$, $C_{\lambda 2}$ are defined in (6.5), and $C_d$ is defined in (5.20). Then, as $n \to \infty$,



sup e||/-/ii 2 <<; 

/GS|,,(A) 



2s 2s(l+I(q = l)) 

Cn 2s +! (Inn) 2 S +i 



if 6 = 0, as < s', 

2/(1+11(0 = 1)) r,fs'_ n ^,\ 

^ (Inn) ^ +% s - a>i ; ^ b = 0,as>s', 

{ C(lnn)~T~ if b > 0. 



2a' 

Cn 2^h 



Remark 4 (Adaptivity) Theorems 1 and 2 demonstrate that, for severe data losses ($b > 0$), the adaptive wavelet nonlinear estimator $\hat f$ given by (6.11) attains the asymptotically optimal (in the minimax sense) global convergence rates. For moderate data losses ($b = 0$ with $\alpha \ge 1$), however, the adaptive wavelet nonlinear estimator $\hat f$ given by (6.11) is asymptotically near-optimal up to a logarithmic factor. Moreover, if $p$ is large and $\alpha \ge 1$ is relatively small ($1 \le \alpha < (1/2 - 1/p)/s$), data loss does not affect the minimax global convergence rates and they coincide with the minimax global convergence rates obtained in the absence of data losses.
Remark 5 (Relation to work of Gaïffas). Estimation of the "zero-affected" part of $f$ in the paper is somewhat similar to the procedure of Gaïffas, with the difference that he used local polynomials while we are using wavelets. However, the significant difference is that we use this estimator only for the zero-affected part and not for the whole function $f$.

Another difference between our study and those of Gaïffas is that we are able to formulate the convergence rates explicitly, in a simple meaningful way, and, due to the fact that we are using thresholding of wavelet coefficients rather than the solution of a system of equations as in Gaïffas, our estimator can adapt to the case when the estimated function is spatially inhomogeneous. Moreover, we should point out that our optimal convergence rates are global and hold over a wide range of Besov balls, compared to those of Gaïffas, which are local or uniform and hold only for Hölder spaces. In particular, Gaïffas (2005, 2007) deals only with estimation of $f$ at $x_0$, the zero of the design density. The rates of convergence of the estimator $\hat f(x_0)$ can be expressed explicitly via $\alpha$ and the parameters of the Hölder ball to which $f$ belongs. However, as we pointed out in Remark 2, this problem is much easier than the global estimation and can be solved by straightforward calculus.

Gaïffas (2006, 2009) also studied convergence rates of his estimator using the uniform norm. In this case, for example, the convergence rates are formulated in terms of a solution of a nonlinear equation. There are no explicit expressions for the rates in a general situation. For instance, the only example, which appears in Gaïffas (2009), is produced for the simplest situation when a = 1, $f$ belongs to a Hölder class with parameters $s = L = 1$ and the design density is of the form $g(x) = 4|x - 1/2|$, i.e. $\alpha = 1$. In this case, the convergence rates are given by



In the cases when $\alpha > 1$ or $\alpha$ is not an integer, both the solution of the equation which produces the convergence rates and the derivation of an explicit expression for the rates require a very nontrivial investigation.

7 Estimation and the minimax upper bounds for the $L^2$-risk when $1/g$ is integrable

The case when $1/g$ is integrable, i.e. when $g$ has a zero of a polynomial order $\alpha$, $0 < \alpha < 1$, has been considered by Chesneau (2007a) who demonstrated that the problem is well-posed when $f$ is spatially homogeneous, i.e., when $p \ge 2$. However, the lower bounds in Theorem 1 show that the problem becomes ill-posed when $\alpha s > s'$, i.e., when $1 \le p < [s(1 - \alpha) + 1/2]^{-1}$. Hence, by considering only spatially homogeneous regression functions ($p \ge 2$), Chesneau (2007a) missed the "elbow rate" when $f$ is spatially inhomogeneous and the fact that the problem becomes ill-posed in this case. However, since the estimators of the wavelet coefficients (4.6) have finite variances for $0 < \alpha < 1$, one can construct an estimator similar to Chesneau (2007a) by simply thresholding wavelet coefficients. In particular, set



r n (x) = (logn/n) a " (a:) 



where 




x e [0,0.5 - (logn/2n) 1 / 4 ), 

x G [0.5 - (logn/2n) 1 /4 ) 0.5 - (logn/^n) 1 / 4 ] 
x E (0.5 + (logn/2rt) 1/4 ,l] 



2 m i-l 



J-l 2-7-1 




(7.1) 



fc=0 



j=m\ k=0 






where $\hat a_{mk}$ and $\hat b_{jk}$ are defined in (5.2), and $m_1$ and $J$ are defined in (5.3) with $b = 0$. The following statement confirms that the estimator (7.1) attains (up to a logarithmic factor) the minimax lower bounds obtained in Theorem 1.

Theorem 3 Let 1 < p < oo, 1 < q < oo, s > max(l/2, l/p), and let d in \5. ||) satisfy inequality 

(l-a)(l + a) v ; 

where $C_d$ is given by (5.20). Then, under Assumption A, with $0 < \alpha < 1$ and $b = 0$, as $n \to \infty$,



/ 2s 2s \ 

O ( n 2s +! (lnn) 2s +! ] 



if as < s', 



sup E||/ c -/f = ^ / _2J ^ +I(a8>s n\ (7.3) 

/eBj i9 (A) O n ^ (lnn)^ + ( - ' if as > s'. 

Theorem 3 shows that for $0 < \alpha < 1$, the problem is regular as long as $p > [s(1 - \alpha) + 1/2]^{-1}$ and it becomes ill-posed when $p < [s(1 - \alpha) + 1/2]^{-1}$. Therefore, even when data loss is very moderate ($0 < \alpha < 1$), the problem becomes ill-posed whenever $f$ is rather spatially non-homogeneous ($p < [s(1 - \alpha) + 1/2]^{-1}$).
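The boundary between the regular and the ill-posed regimes for $b = 0$ is easy to tabulate. The sketch below assumes $s' = s + 1/2 - 1/p$, which is the value consistent with the equivalence between $\alpha s > s'$ and $p < [s(1-\alpha) + 1/2]^{-1}$ stated in this section, and it reports only the polynomial exponents $2s/(2s+1)$ and $2s'/(2s'+\alpha)$ of the rates, ignoring logarithmic factors. All function names are hypothetical helpers.

```python
def s_prime(s, p):
    """Assumed definition s' = s + 1/2 - 1/p."""
    return s + 0.5 - 1.0 / p


def is_ill_posed(s, p, alpha):
    """True when alpha * s > s', i.e. when the zero of g slows the rate down."""
    return alpha * s > s_prime(s, p)


def critical_p(s, alpha):
    """Value [s (1 - alpha) + 1/2]^{-1} separating the regimes for 0 < alpha < 1."""
    return 1.0 / (s * (1.0 - alpha) + 0.5)


def rate_exponent(s, p, alpha):
    """Exponent r such that the minimax risk behaves like n^{-r} (logs ignored)."""
    sp = s_prime(s, p)
    if alpha * s > sp:
        return 2.0 * sp / (2.0 * sp + alpha)
    return 2.0 * s / (2.0 * s + 1.0)


if __name__ == "__main__":
    s, alpha = 2.0, 0.9
    for p in (1.2, 1.5, 2.0, 4.0):
        print(p, is_ill_posed(s, p, alpha), round(rate_exponent(s, p, alpha), 4))
    print("critical p:", critical_p(s, alpha))
```

At the boundary $\alpha s = s'$ the two exponents coincide, so the exponent is a continuous function of $p$ and the transition shows up only as a change of slope.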



8 Discussion 

We considered the nonparametric regression estimation problem of recovering an unknown response function $f$ on the unit interval $[0, 1]$ on the basis of incomplete data when the design density $g$ is known and has a zero $x_0 \in (0, 1)$ of a polynomial or an exponential order. We investigated the global estimation (in the minimax sense) of $f$, which is a much harder problem than pointwise estimation studied by Gaïffas (2005, 2007) since the problem cannot be reduced to the estimation of a related regularly-sampled function (see Remarks 1 and 2).

The problem of global nonparametric estimation of a regression function is ill-posed and, moreover, it is spatially inhomogeneous. For this reason, the resulting estimators demonstrate completely different patterns of behavior in comparison with spatially homogeneous ill-posed problems like, e.g., deconvolution.

We studied various regimes of data loss from relatively minor (when the design density $g$ has a zero of polynomial order $\alpha \in (0, 1)$ and, therefore, $1/g$ is integrable) to moderate (when the design density $g$ has a zero of polynomial order $\alpha \ge 1$ and, hence, $1/g$ is not integrable) and severe (when the design density $g$ has a zero of exponential order, so that $1/g$ is not integrable).

Global convergence rates in the case of minor data losses ($0 < \alpha < 1$) were studied by Chesneau (2007a) who showed that the problem is well-posed (the minimax global convergence rates are the same as in the absence of data loss) whenever the regression function $f$ is spatially homogeneous. As our study shows, the problem remains well-posed even if $f$ is spatially inhomogeneous as long as the data loss is very minor ($0 < \alpha < 1 - (1/p - 1/2)/s$) or the function is relatively smooth ($p > (1/2 - s(\alpha - 1))^{-1}$). When $\alpha > 1 - (1/p - 1/2)/s$ ($p < (1/2 - s(\alpha - 1))^{-1}$), the problem becomes ill-posed.

Now, consider the situation when data loss is moderate ($b = 0$ and the zero of $g$ is of a polynomial order $\alpha \ge 1$). The problem is ill-posed if $\alpha > (1/2 - 1/p)/s$, i.e., it is always ill-posed when $f$ is spatially inhomogeneous ($1 \le p < 2$). However, as Remark 3 points out, when $f$ is very spatially homogeneous ($p$ is rather large) and data loss is relatively moderate ($1 \le \alpha < (1/2 - 1/p)/s$), the problem of estimation of $f$ ceases to be ill-posed and exhibits the minimax global convergence rates observed when $g$ is bounded from below. Thus, in the case when $f$ is very spatially homogeneous, the estimator of $f$ is "borrowing strength" in the areas where $f$ is adequately sampled and exhibits minimax global convergence rates common for regularly spaced regression estimation problems. This is very dissimilar to spatially homogeneous ill-posed problems (e.g., deconvolution) where there is a change point in the minimax global convergence rates (the so-called elbow effect) when the estimated function $f \in B^s_{p,q}$ is spatially inhomogeneous ($1 \le p < 2$) and they are independent of $p$ when it is spatially homogeneous ($2 \le p \le \infty$). On the contrary, in the case of spatially inhomogeneous ill-posed problems, like the one considered herein, the minimax global convergence rates depend on $p$ even when the function is spatially homogeneous ($2 \le p \le \infty$) as long as $\alpha > (1/2 - 1/p)/s$. Thus, the elbow effect occurs when $p > 2$, in particular, when $p > 2/(1 - (\alpha - 1)s)$ provided $1 < \alpha < 1 + 1/s$.

In the case when data loss is severe ($b > 0$ and the zero of $g$ is of an exponential order $\beta > 0$), the minimax global convergence rates grow with $p$, i.e., the more spatially homogeneous $f$ is, the better it can be estimated. This is unlike spatially homogeneous ill-posed problems, like a deconvolution problem, where the minimax global convergence rates improve as $p$ grows when $1 \le p < 2$ and are independent of $p$ when $f$ is spatially homogeneous ($2 \le p \le \infty$).

The unusual behavior of the minimax global convergence rates in the case of the spatially inhomogeneous ill-posed problem considered above calls for different estimation strategies. In particular, whenever data loss is moderate or severe, we partition $f$ into zero-affected and zero-free parts. First, we construct an adaptive linear wavelet estimator of the zero-affected part where the lowest resolution level $m = \hat m$ is independent of the unknown parameters of the Besov balls and, therefore, known when $b > 0$, and is chosen using Lepski's method when $b = 0$. After that, we construct a nonlinear wavelet estimator of the zero-free part of $f$ starting from the lowest resolution level $m = \hat m$. Note that a nonlinear estimator is required even if $g$ has a zero of exponential order ($b > 0$). This is very different from the case of spatially homogeneous ill-posed problems (e.g., deconvolution), where in the case of exponentially growing eigenvalues, a linear estimator usually attains optimal (in the minimax sense) global convergence rates (see Pensky and Sapatinas (2009, 2010)).

We should mention that there is a significant difference between minimax local and minimax global convergence rates. Note that minimax local convergence rates at the zero of $g$ are always affected by loss of data, even for moderate data losses. The minimax global convergence rates, however, are not affected when data loss is limited and the regression function is very spatially homogeneous ($1 < \alpha < 1 + 1/s$ and $p > 2/(1 - (\alpha - 1)s)$). Finally, we point out that some of the logarithmic factors which appear in Theorems 1, 2 and 3 could be removed by using block thresholding rather than term-by-term thresholding of wavelet coefficients.

Furthermore, due to its construction, the suggested wavelet estimator is not easily computable, so it is of limited practical use. Therefore, it is desirable to construct an alternative, more computationally feasible, adaptive estimator which still attains the asymptotically minimax global convergence rates that were the aim of this work. This is a project for future work that we hope to address elsewhere.
9 Proofs



9.1 Lower bounds 

Proof of Theorem 1. On noting that the asymptotical lower bound in Theorem 3.1 of Chesneau (2007a) is also true when $b = 0$ and $1/g$ is not integrable (i.e., $\alpha \ge 1$), the asymptotical lower bound in the first part of (3.2) can be obtained by the arguments of Chesneau (2007a) and, hence, we need to prove only the asymptotical lower bounds in the second and third parts of (3.2). For this purpose, we consider functions $f_{jk}$ of the form $f_{jk} = \gamma_j \psi_{jk}$ and let $f_0 \equiv 0$. Note that by (3.1), in order that $f_{jk} \in B^s_{p,q}(A)$, we need $\gamma_j \le A 2^{-js'}$. Set $\gamma_j = c 2^{-js'}$, where $c$ is a positive constant such that $c < A$, and apply the following classical lemma on lower bounds:

Lemma 5 (Härdle, Kerkyacharian, Picard & Tsybakov (1998), Lemma 10.1). Let $V$ be a functional space, and let $d(\cdot, \cdot)$ be a distance on $V$. For $f, g \in V$, denote by $\Lambda_n(f, g)$ the likelihood ratio $\Lambda_n(f, g) = d\mathbb{P}_{X_n}^{(f)} / d\mathbb{P}_{X_n}^{(g)}$, where $d\mathbb{P}_{X_n}^{(h)}$ is the probability distribution of the process $X_n$ when $h$ is true. Let $V$ contain the functions $f_0, f_1, \ldots, f_\aleph$ such that

(a) $d(f_k, f_{k'}) \ge \delta > 0$ for $k = 0, 1, \ldots, \aleph$, $k \ne k'$,

(b) $\aleph \ge \exp(\lambda_n)$ for some $\lambda_n > 0$,

(c) $\ln \Lambda_n(f_0, f_k) = u_{nk} - v_{nk}$, where $v_{nk}$ are constants and $u_{nk}$ is a random variable such that there exists $\pi_0 > 0$ with $\mathbb{P}_{f_k}(u_{nk} > 0) \ge \pi_0$,

(d) $\sup_k v_{nk} \le \lambda_n$.

Then, for an arbitrary estimator $\tilde f$,

$$\sup_{f \in V} \mathbb{P}_{X_n}^{(f)}\left( d(\tilde f, f) \ge \delta/2 \right) \ge \pi_0/2.$$

Let now $V = \{f_{jk} : |k - k_{0j}| \le K/2\}$, where $K > 2$ is a fixed positive constant, so that $\aleph = K$. Choose $d(f, g) = \|f - g\|$, where, as before, $\|\cdot\|$ denotes the $L^2$-norm on the interval $[0, 1]$. Then, $d(f_{jk}, f_0) = \gamma_j = \delta$. Let $v_{nk} = \lambda_n = \ln K$ and $u_{nk} = \ln \Lambda_n(f_0, f_{jk}) + \ln K$. Now, in order to apply Lemma 5 we need to show that for some $\pi_0 > 0$, uniformly for all $f_{jk}$, we have

$$\mathbb{P}_{f_{jk}}(u_{nk} > 0) = \mathbb{P}_{f_{jk}}\left( \ln \Lambda_n(f_0, f_{jk}) > -\ln K \right) \ge \pi_0 > 0.$$

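Since, by Chebyshev's inequality,

$$\mathbb{P}_{f_{jk}}\left( \ln \Lambda_n(f_0, f_{jk}) > -\ln K \right) \ge 1 - \frac{\mathbb{E}_{f_{jk}} |\ln \Lambda_n(f_0, f_{jk})|}{\ln K},$$

we need to find a uniform upper bound for $\mathbb{E}_{f_{jk}} |\ln \Lambda_n(f_0, f_{jk})|$.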
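Note that

$$-2 \ln \Lambda_n(f_0, f_{jk}) = \sum_{i=1}^n \gamma_j^2 \psi_{jk}^2(x_i) + 2 \sum_{i=1}^n \gamma_j \psi_{jk}(x_i) \xi_i,$$

where $\xi_i$, $i = 1, 2, \ldots, n$, are independent standard Gaussian random variables. Thus,

$$\mathbb{E}\left| -2 \ln \Lambda_n(f_0, f_{jk}) \right| \le A_n + 2B_n,$$

where

$$A_n = \mathbb{E}\Big| \sum_{i=1}^n \gamma_j^2 \psi_{jk}^2(x_i) \Big| = n \gamma_j^2 \int_0^1 \psi_{jk}^2(x) g(x)\, dx, \quad B_n = \mathbb{E}\Big| \sum_{i=1}^n \gamma_j \psi_{jk}(x_i) \xi_i \Big|.$$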
i=i ^° i=i 
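Note that, by Jensen's inequality,

$$B_n = \mathbb{E}\left[ \mathbb{E}\left( \Big| \sum_{i=1}^n \gamma_j \psi_{jk}(x_i) \xi_i \Big| \,\Big|\, x_1, x_2, \ldots, x_n \right) \right] \le \mathbb{E}\left\{ \left[ \mathbb{E}\left( \Big( \sum_{i=1}^n \gamma_j \psi_{jk}(x_i) \xi_i \Big)^2 \,\Big|\, x_1, x_2, \ldots, x_n \right) \right]^{1/2} \right\} \le \left[ \mathbb{E} \sum_{i=1}^n \gamma_j^2 \psi_{jk}^2(x_i) \right]^{1/2} = \sqrt{A_n},$$

so that one needs uniform upper bounds for $A_n$ only.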
If $j$ is large enough, $A_n$ can be presented as

$$A_n = n \gamma_j^2 \int \psi^2(z)\, g\left( x_0 + 2^{-j}(z + k - k_{0j}) \right) dz,$$

where $k_{0j} = 2^j x_0$. Observe that condition (2.1) implies that one has

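$$g(x + x_0) \le C |x|^{\alpha} \exp(-b|x|^{-\beta}).$$

Let $M_\psi = \max(|L_\psi|, |U_\psi|)$. Then, for a finite value of $K$, one has

$$A_n \le C n \gamma_j^2 2^{-j\alpha} (M_\psi + K)^{\alpha} \exp\left( -b 2^{j\beta} (M_\psi + K)^{-\beta} \right).$$

Now, recall that $\gamma_j = c 2^{-js'}$ and choose the smallest possible value of $j$ such that $A_n$ are uniformly bounded. Simple calculations yield that

$$A_n = O\left( n 2^{-j(2s' + \alpha)} \right) \exp\left( -b 2^{j\beta} [M_\psi + K]^{-\beta} \right),$$

so that $2^j = O\left( n^{1/(2s' + \alpha)} \right)$ if $b = 0$ and $2^j = O\left( (\ln n)^{1/\beta} \right)$ if $b > 0$.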

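Now, applying Lemma 5 and Chebyshev's inequality, we finally obtain

$$\inf_{\tilde f_n} \sup_{f \in B^s_{p,q}(A)} \mathbb{E}\|\tilde f_n - f\|^2 \ge \inf_{\tilde f_n} \sup_{f \in V} (\gamma_j^2/4)\, \mathbb{P}\left( \|\tilde f_n - f\| \ge \gamma_j/2 \right) \ge \pi_0 \gamma_j^2 / 8, \qquad (9.1)$$

which, on noting that

$$\frac{2s'}{2s' + \alpha} < \frac{2s}{2s + 1} \quad \text{if and only if} \quad s' < \alpha s,$$

completes the proof of the theorem.

9.2 Properties of estimators of wavelet coefficients

Consider the quantity

$$J_{mkl} = \int 2^m \left| \varphi(2^m x - k) \varphi(2^m x - l) \right| g^{-1}(x)\, dx. \qquad (9.2)$$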



Lemma 6 Let m = m(n) be a nonrandom value and let d m k be defined by Hi4.6\ ). Then, for k, I £ 
K Omc> as co, 

(9.3) 

if \k - l\ < U v - L v , (9.4) 



\Cov(d mk ,a m i)\ =0(n 1 (J mM + 1)) , 



where 



Jmki = 0[n- L 2 ma \k - k 0m \~ a exp(b2 m/3 \k - k 
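and $J_{mkl} = 0$ otherwise. Moreover, if $b = 0$, then, as $n \to \infty$,

$$\mathrm{Var}(\hat a_{mk}) = O\left( n^{-1} 2^{m\alpha} |k - k_{0m}|^{-\alpha} \right),$$

$$\mathbb{E}(\hat a_{mk} - a_{mk})^4 = O\left( n^{-3} 2^{m(3\alpha + 1)} |k - k_{0m}|^{-3\alpha} \right) + O\left( n^{-2} 2^{2m\alpha} |k - k_{0m}|^{-2\alpha} \right). \qquad (9.5)$$

Similarly, if $k, l \in K^{\psi}_{0jc}$ and $b = 0$, then $\hat b_{jk}$, defined in (4.6), satisfy, as $n \to \infty$,

$$\mathrm{Var}(\hat b_{jk}) = O\left( n^{-1} 2^{j\alpha} |k - k_{0j}|^{-\alpha} \right), \qquad (9.6)$$

$$\mathbb{E}(\hat b_{jk} - b_{jk})^4 = O\left( n^{-3} 2^{j(3\alpha + 1)} |k - k_{0j}|^{-3\alpha} \right) + O\left( n^{-2} 2^{2j\alpha} |k - k_{0j}|^{-2\alpha} \right), \qquad (9.7)$$

$$\mathbb{E}(\hat b_{jk} - b_{jk})^6 = O\left( n^{-5} 2^{j(5\alpha + 2)} |k - k_{0j}|^{-5\alpha} \right) + O\left( n^{-4} 2^{j(4\alpha + 1)} |k - k_{0j}|^{-4\alpha} \right) + O\left( n^{-3} 2^{3j\alpha} |k - k_{0j}|^{-3\alpha} \right). \qquad (9.8)$$

Proof of Lemma 6. Let us first prove formula (9.4). Changing variables $z = 2^m(x - x_0)$ in the last integral, and using inequality (2.2), derive that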



Jmki — 



J L \z + k- k Qm \ a exp (-b2 m ?\z + k- k 0m \-P) ' 



\(p(z)\\tp(z + k - l)\ dz 



It is easy to note that $J_{mkl} = 0$ if $|k - l| > U_\varphi - L_\varphi$. Also, $k \in K^{\varphi}_{0mc}$ implies that $k_{0m} - k < L_\varphi - 1$ or $k_{0m} - k > U_\varphi + 1$, so that one has $|z + k - k_{0m}| > 1$ and, hence, $|z + k - k_{0m}| \asymp |k - k_{0m}|$, which proves (9.4). Now, by direct calculations we obtain that

$$\mathrm{Cov}(\hat a_{mk}, \hat a_{ml}) = n^{-1} \left\{ \int [\sigma^2 + f^2(x)]\, \varphi_{mk}(x) \varphi_{ml}(x) g^{-1}(x)\, dx - a_{mk} a_{ml} \right\},$$

so that (9.3) is valid.

Since the proofs for the scaling and the wavelet coefficients in Lemma 6 are similar, we shall prove only formulae (9.6)-(9.8). Observe that, due to (2.2) and the fact that $k \in K^{\psi}_{0jc}$ implies $k_{0j} - k < L_\psi - 1$ or $k_{0j} - k > U_\psi + 1$, by considerations similar to the ones provided above, for integers $r_1, r_2 > 0$, one has

$$\int [g(x)]^{-r_1} (\psi_{jk}(x))^{2r_2}\, dx \le C 2^{j(r_2 - 1)} 2^{j r_1 \alpha} |k - k_{0j}|^{-r_1 \alpha}. \qquad (9.9)$$

Now, to complete the proof of (9.6)-(9.8), as $n \to \infty$, apply (9.9) to the following formulae



V&r(b jk ) 
E(6j fc - * 3 kf 

nhk - b Jk f 



O n" 1 / g-\x)i>] k (x)dx ), 
O U- 5 f g-\x)tf k {x)d. 



9 1 (.x)if; 2 k {x)dx 



ix + n 



9 1 (x)ip] k (x)dx 
+ n~ 4 J g~ 3 (x)tl)j k (x)dx J g' 1 {x)ip 2 k (x)dx 
9.3 Proofs of the supplementary statements used in the proof of Lemma [0

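Lemma 7 Let $\delta_0 = 0.5 \cdot 3^{\beta+1} [2 \cdot 3^{\beta+1} + (2M)^{\beta+1}]^{-1}$. Let $0 < \delta < \delta_0$ and $a, b \in (2 - \delta, M]$ for some $M > 0$. Let $c > 0$ be such that $c < \min(a, b) + \delta$ and $c < \max(a, b) - (1 - 2\delta)$. Then,

$$a^{-\beta} + b^{-\beta} - 2c^{-\beta} < -0.5 \beta M^{-(\beta+1)}.$$

Proof of Lemma 7. To show that the lemma is true, note that $\delta < \delta_0$ implies $\delta < 1/2$. Let, without loss of generality, $a \le b$. Then, $c < a + \delta$, $c < b - (1 - 2\delta)$ and

$$a^{-\beta} - c^{-\beta} < \beta a^{-(\beta+1)} \delta < (2 - \delta)^{-(\beta+1)} \beta \delta,$$

$$b^{-\beta} - c^{-\beta} < b^{-\beta} - (b - 1 + 2\delta)^{-\beta} < -\beta M^{-(\beta+1)} (1 - 2\delta).$$

Therefore, taking into account that $2 - \delta > 3/2$ and $\delta < \delta_0$, we obtain

$$a^{-\beta} + b^{-\beta} - 2c^{-\beta} < (2 - \delta)^{-(\beta+1)} \beta \delta - \beta M^{-(\beta+1)} (1 - 2\delta)$$

$$< -\beta M^{-(\beta+1)} (3/2)^{-(\beta+1)} \left[ (3/2)^{\beta+1} - \delta\left( M^{\beta+1} + 2(3/2)^{\beta+1} \right) \right] < -0.5 \beta M^{-(\beta+1)},$$

which proves the lemma.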
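The inequality of Lemma 7 can be checked numerically on a grid. The sketch below assumes the reading $\delta_0 = 0.5 \cdot 3^{\beta+1}[2 \cdot 3^{\beta+1} + (2M)^{\beta+1}]^{-1}$ and the bound $a^{-\beta} + b^{-\beta} - 2c^{-\beta} < -0.5\beta M^{-(\beta+1)}$. Since the left-hand side increases with $c$, it evaluates $c$ at the upper limit of its admissible range, which is the least favourable case. The function names and the grid size are illustrative choices.

```python
def delta0(beta, M):
    """Assumed constant delta_0 of Lemma 7."""
    return 0.5 * 3.0 ** (beta + 1) / (2.0 * 3.0 ** (beta + 1) + (2.0 * M) ** (beta + 1))


def lemma7_margin(a, b, c, beta, M):
    """Right-hand side minus left-hand side; positive when the inequality holds."""
    lhs = a ** (-beta) + b ** (-beta) - 2.0 * c ** (-beta)
    rhs = -0.5 * beta * M ** (-(beta + 1))
    return rhs - lhs


def worst_margin(beta, M, grid=40):
    """Smallest margin over a grid of (a, b) in (2 - delta, M], c at its upper limit."""
    delta = 0.5 * delta0(beta, M)
    lo = 2.0 - delta
    pts = [lo + (M - lo) * (i + 1) / grid for i in range(grid)]
    worst = float("inf")
    for a in pts:
        for b in pts:
            # Least favourable c: the supremum of the admissible values.
            c = min(min(a, b) + delta, max(a, b) - (1.0 - 2.0 * delta))
            worst = min(worst, lemma7_margin(a, b, c, beta, M))
    return worst


if __name__ == "__main__":
    for beta, M in [(0.5, 10.0), (1.0, 3.0), (2.0, 6.0)]:
        print(beta, M, worst_margin(beta, M))
```

A positive value of `worst_margin` on the grid is only a sanity check of the stated constants, not a substitute for the proof.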



Lemma 8 Let A^\ Bjy , Cj and be given by 115. 9\) , \5. 10\) . 115.11)) and {5.1$ , respectively. 
Then, Varic^ 1 ) = O (n^ 1 A^ J, and for some absolute constants C\ and C2 one has 



d2~ ma exp(-62 /3 ( m+1 )) < A$ < C 2 2~ ma exp(-6M" /3 2 m/3 ), (9.10) 

where M v = U v -L v +xosx{\U v \,\L v \). Moreover, ifb > andO < 5 b < 5 for5 = 0.5 [2 3 /3+l + 
(U v + L^ +1 ]-\ then 



IA (m) l 

' < Cexp (-0.25 b(U v + L v )-W +r > 2 mfi \. (9.11) 



In addition, if b = and mi < m < J — 1, then, as n —> 00, 

||B( m ) || = O (2" ma / 2 ) , E(ci m) - 4 m) ) 4 = O (n" 2 2- 2ma ) . (9.12) 

Proof of Lemma [8l First, note that, by (|5.9p . one has Var(c["^) = n~ l f ipf nk (x)(f 2 (x) + 

If b = 0, then 



ct 2 ) 5 (x) I(2 m x - I e O^cte = O (n -1 A^ 5 ) . 

E(g M_ c M )4 = f-s f^ mk{x)g{x)dx + n - 



i Pmk( x )9(x)dx J 
= O (n" 3 2 m 2" ma + n~ 2 2~ 2mQ ) = O (n" 2 2- 2mQ ) 

since $n^{-1} 2^{m(1+\alpha)} < 1$ for $m \le J - 1$, which completes the proof of the second half of (9.12). Now, observe that, as $n \to \infty$,

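$$A^{(m)}_{lk} = \int \varphi(z + k_{0m} - k) \varphi(z + k_{0m} - l)\, g(x_0 + 2^{-m} z)\, I(z + k_{0m} - l \in \Omega_\delta)\, dz \qquad (9.13)$$

$$\sim C_g 2^{-m\alpha} \int \varphi(z + k_{0m} - k) \varphi(z + k_{0m} - l) |z|^{\alpha} \exp(-b 2^{m\beta} |z|^{-\beta})\, dz, \quad k, l \in K^{\varphi}_{0m}, \qquad (9.14)$$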
and has a similar expression, just with k G K$ m and I G -K"o m , where .Kg m is defined in ([57
Recalling that 6 = and the quantities | k — ko m | and 1 1 — ko m | are uniformly bounded for k G K^ mc 

and I G i£^ m , obtain (for 6 = 0) that \B^\ = 0( 2_mQ )> so that tlie first statement in ([97T2]) is true 
due to the fact that matrix is finite dimensional. 

Now, let 6 > and let us prove (j9~7TU|h Observe that < z + k 0m - k < U v and k G K% m 
imply \z\ < M v . Hence, the upper bound in (|9.10p follows from (|2.2[) and (|9.13p . In order to prove 
the lower bound in (|9.10|) . note that 

4? > C gl 2~ ma [ V 2 (z)\z- (k 0m -k)\ a exp(-62m/3|z - (k 0m - k)\^) dz 

where Q* s = (L v + S b , (L v + U 9 - l)/2) U ((L v + U v + l)/2, U v - 5 b ) and S b is defined in dSHJ). 
Since |z - (k 0m ~ k)\ > 1/2 for z G fij, and by jf/J), (L^ + - l)/2 - (L v + S b ) > 1 and 
(E/p - 5ft) - (^ + C^p + l)/2 > 1, one has 

/ f (L lp +U lp -l)/2 rU v -S b \ 

4T ^ C 9 i2- Q(m+1) exp(-&2# m+1 >) min / <^ 2 (z)(fe, / ^(^cfe , 

V 7 J(L v +U v +l)/2 J 

which completes the proof of (19.101) . 

Finally, let us prove (9.11). Note that the asymptotic value of the integral in (9.13) is determined by the value at the point which maximizes the argument of the exponential function. Recall that (see, e.g., Dingle (1973)) if $F(\lambda) = \int_a^b h(x) \exp(\lambda S(x))\, dx$, where $\max S(x)$ is achieved at $x = a$ and $S(x)$ is a decreasing function of $x$, the functions $h(x)$ and $S(x)$ are continuous on $[a, b]$ and infinitely differentiable in the neighborhood of $x = a$ with $S'(a) \ne 0$, then, as $\lambda \to \infty$, $F(\lambda)$ has the following asymptotic expression

$$F(\lambda) \sim \exp(\lambda S(a)) \sum_{k=0}^{\infty} c_k \lambda^{-(k+1)} \quad \text{with} \quad c_k = -D^k \left( \frac{h(x)}{S'(x)} \right) \Big|_{x=a}, \qquad (9.15)$$

where $D$ is the differential operator of the form $D = -\dfrac{1}{S'(x)} \dfrac{d}{dx}$.

It is easy to calculate that $\exp(-b 2^{m\beta} |z|^{-\beta})$ takes its maximum value at $z = z^{(l,k,\delta)}_{\max}$ with $|z^{(l,k,\delta)}_{\max}| = \max(|u^{(l,k,\delta)}|, |v^{(l,k,\delta)}|)$, where $u^{(l,k,\delta)} = \max(L_\varphi + k - k_{0m},\ L_\varphi + \delta_b + l - k_{0m})$, $v^{(l,k,\delta)} = \min(U_\varphi + k - k_{0m},\ U_\varphi - \delta_b + l - k_{0m})$ and $L_\varphi \le k - k_{0m},\ l - k_{0m} \le U_\varphi$. In what follows, we shall drop the superscripts whenever it does not cause confusion.
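The endpoint expansion (9.15) can be illustrated on a toy integral. The sketch below takes $h(x) = \cos x$ and $S(x) = -x - x^2$ on $[0, 1]$, which are choices made only for this example, so that $S$ is decreasing with its maximum $S(0) = 0$ at $a = 0$ and $S'(0) = -1$. Reading (9.15) as $c_k = -D^k(h/S')$ evaluated at $x = a$ with $D = -(1/S'(x))\, d/dx$ gives $c_0 = -h(0)/S'(0) = 1$ and $c_1 = (h/S')'(0)/S'(0) = -2$. The two-term approximation is compared with a composite Simpson evaluation of the integral.

```python
import math


def integrand(x, lam):
    """h(x) * exp(lam * S(x)) with h = cos and S(x) = -x - x^2 (toy choices)."""
    return math.cos(x) * math.exp(-lam * (x + x * x))


def integral_simpson(lam, n=20000):
    """Composite Simpson rule on [0, 1] with an even number n of subintervals."""
    h = 1.0 / n
    total = integrand(0.0, lam) + integrand(1.0, lam)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h, lam)
    return total * h / 3.0


def laplace_two_terms(lam):
    """exp(lam * S(0)) * (c_0 / lam + c_1 / lam^2) with c_0 = 1, c_1 = -2, S(0) = 0."""
    return 1.0 / lam - 2.0 / lam ** 2


if __name__ == "__main__":
    for lam in (50.0, 200.0, 800.0):
        exact = integral_simpson(lam)
        print(lam, exact, laplace_two_terms(lam), abs(exact / laplace_two_terms(lam) - 1.0))
```

The relative error of the two-term approximation decays roughly like $\lambda^{-2}$, in line with the next term of the series being of order $\lambda^{-3}$.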

First, consider the case of k = I. Then, by examining the cases ko m — I < (L^ + U v )/2 and 
fcom — I > (L<p + separately, one can easily conclude that 

,(l,l,S) _ / \ L f + $b + I ~ k)ml if k 0m - I > (L v + U v )/2, 



7 {hhd) _ J |-"V 1 "0 1 1 '"Umh " "Urn - ^ \^ip i ^ V ) I "t (n i c\ 

max 1 1^-4 + /- fcoml, if kom - i < (L v + U v )/2, { ^ 0) 

where, by (|4.2p . > (C/^, — — 25 b )/2 > 2 — 5 b > 1 in both cases. Hence, since <£>(0max ) / 

by definition of ^6, formula (|9.15p yields 

4 m) ~ C g {bpr\\& + k 0m - l) |zg£?| a 2" m(Q+/3) exp(-62^|^)|^) 

> _f^ 1 2~ m ( a +/ 3 ) exp(-62 m/3 |zW^|- /3 ). (9.17) 

If $k \ne l$, then $|k - l| \ge 1$ and one has four cases, depending on whether $k_{0m} - k$ and $k_{0m} - l$ are smaller or greater than $(L_\varphi + U_\varphi)/2$. We shall consider two of those since the other two cases are similar. In what follows, we denote by $z^{(l,k,0)}_{\max}$ the value of $z^{(l,k,\delta)}_{\max}$ obtained if $\delta = \delta_b = 0$.
If k 0m - I < (L^ + U v )/2 and k 0m - k < (L v + U v )/2 then \zm$\ = U v - S b + / - k 0m ,
|^max' 0) | = U v + k - k 0m and, since 5 b < 1/2, 

(l,k,S)\ _ / U v - Sb + I - k 0m , if l>k 



" max 1 U^ + k-k 0m , if Z<fc. 



Therefore, taking into account that |zmax^| = |-Zmax°^| — 5 b , one derives that 

max (\zU£\ \&\) - \&\ > 1 " 25 6 - (9.18) 



Now, consider the case when ko m — I < (L v + U^/2 and ko m — k > (L^ + U v )/2. In this situation, 

\zrkk ■ \ =U^-5b + l- k 0m , |zmax°^| = k 0m - k — L v and |*ffiajc | = max(|C/ v + k- ko m \,\L v + 5 b + 
I — ko m \), so that relation (|9,18p is again true. Cases when ko m — I > {L v + U lfi )/2 can be examined 
in a similar manner and it can be shown that (|9.18|) is valid. 

The asymptotic expression for A^j as m — > oo can be obtained using formula (|9.15p 

4f ~ C g K(<p,b,P,z max ) 2 ~ m ^2~ m ^ exp(-62m/3|z max |-' 3 ), (9.19) 

where K((p,b, ft, z max ) depends on <fi,b,(3 and 2; max only and, hence, uniformly bounded, r* = if 
z max does not coincide with or and r* = tq + 1 if it does. Here, tq is the number of continuous 
derivatives of tp. 

We are now ready to complete the proof of the lemma. Recall that 

< min(|^)|, \z££V\) < min(|*gAf)|, \z££% + 5 b , 

and, by (pT8j) . that 

|^/)| < max(|^)|, \zt*' 5) \) - (1 - 2<5 6 ). 

Since |zmax^| > 2 — ^ and |zmax^| > 2 — <5^,, an application of Lemma with a = |zmax^|, 
b = l^max |) c = l^max I arid M = (L^ + U v> )/2, completes the proof of the lemma. 



Lemma 9 Let A be the matrix with the entries given by \5. 9\) and let D be the diagonal matrix 
D = diag(A). Denote Q = D -1 AD -1 . Then, for any b > one has ||Q _1 || = O(l) as m — > oo. 
Moreover, if b > 0, then Q _1 = I + H, where 

||H|| = O (exp(-0.125 Wg2 m/3 )) , m ^ oo, 
and 5q is defined in Lemma\^ i.e., Q" 1 = 1(1 + o(l)) as m — > oo. 

Proof of Lemma 9. Note that the matrix $Q$ is a $(U_\varphi - L_\varphi + 1)$-dimensional positive definite matrix with a unit main diagonal and smaller off-diagonal entries, so that it has a non-asymptotic bounded inverse $Q^{-1}$ with $\|Q^{-1}\| = O(1)$.

If b > 0, then = 1, so that Q = I + H. Here, by Lemma [9l H is a finite dimensional 
matrix with elements H ik = O (exp{-0.25 b(U v + L^) - ^" 1 " 1 ) 2 m ^}), as m — > oo. Hence, ||H|| = 
O (expj— 0.25 biJJ^ + -L^) - ^ 3 " 1 " 1 ) 2 m ^}). To complete the proof of the lemma, it suffices to note 
that 



Q 



1 + 



oo 

E( 

k=l 



) k H k where 



fc=i 
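The last step of this proof relies on the Neumann series: when $Q = I + H$ with $\|H\| < 1$, one has $Q^{-1} = I + \sum_{k \ge 1} (-1)^k H^k$, so $Q^{-1}$ differs from $I$ by a matrix whose norm is of the order of $\|H\|$. The sketch below illustrates this on a small symmetric matrix with a unit diagonal. The matrix entries and the helper names are hypothetical and chosen only for the example; plain nested lists are used to keep the block free of dependencies.

```python
def identity(d):
    """d x d identity matrix as nested lists."""
    return [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]


def mat_mul(A, B):
    """Product of two square matrices given as nested lists."""
    d = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(d)) for j in range(d)] for i in range(d)]


def neumann_inverse(H, terms=40):
    """Truncated series I + sum_{k=1}^{terms} (-1)^k H^k approximating (I + H)^{-1}."""
    d = len(H)
    neg_H = [[-x for x in row] for row in H]
    result = identity(d)
    power = identity(d)
    for _ in range(terms):
        power = mat_mul(power, neg_H)  # now equals (-H)^k
        result = [[result[i][j] + power[i][j] for j in range(d)] for i in range(d)]
    return result


if __name__ == "__main__":
    # Small off-diagonal part, mimicking a matrix Q = I + H with a unit diagonal.
    H = [[0.0, 0.1, 0.05], [0.1, 0.0, 0.1], [0.05, 0.1, 0.0]]
    Q = [[identity(3)[i][j] + H[i][j] for j in range(3)] for i in range(3)]
    for row in mat_mul(Q, neumann_inverse(H)):
        print([round(x, 12) for x in row])
```

The smaller the off-diagonal entries of $Q$, the fewer terms of the series are needed, which is the mechanism behind $Q^{-1} = I(1 + o(1))$ when the entries of $H$ decay exponentially in $2^{m\beta}$.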
9.4 Proofs of the large deviation results

Denote

$$\varrho_n = n^{-1/2} \sqrt{\ln n}. \qquad (9.20)$$

In order to prove Lemma 01 we need the following three large deviation results.



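Lemma 10 Let $b = 0$. Let $w(x)$ be a bounded function with a bounded support $(W_1, W_2)$ and a unit $L^2$-norm. Denote $w_{jk}(x) = 2^{j/2} w(2^j x - k)$ and set

$$\beta_{jk} = \int_0^1 w_{jk}(x) f(x)\, dx, \quad \hat\beta_{jk} = n^{-1} \sum_{i=1}^n [g(x_i)]^{-1} w_{jk}(x_i) y_i,$$

where $f$ is the unknown response function in model (1.1). Let $C_{g1}$ be defined in (2.2), and let

$$C_w = [2\max(|W_1|, |W_2|)]^{\alpha}, \quad C_\tau = 8 C_w (C_{g1})^{-1} \max\left( 2,\ 2\|f\|_\infty^2,\ \|f\|_\infty \|w\|_\infty / 3,\ \|w\|_\infty \right). \qquad (9.21)$$

Then, for $m_1 \le j \le J - 1$, $k \in K^{w}_{0jc}$ and $\tau \ge 1$, as $n \to \infty$, one has

$$\mathbb{P}\left( |\hat\beta_{jk} - \beta_{jk}| > \tau \varrho_n 2^{j\alpha/2} |k - k_{0j}|^{-\alpha/2} \right) = O\left( n^{-\tau / C_\tau} \right). \qquad (9.22)$$

Here, $m_1$ and $J$ are defined in (5.3), $\varrho_n$ is defined in (9.20), and

$$K^{w}_{0jc} = \left\{ k : 0 \le k \le 2^j - 1,\ x_0 \notin \mathrm{supp}\, w_{jk} \right\}.$$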

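Proof of Lemma 10. First, note that the slightly unusual formulation of this lemma is due to the fact that we are planning to use it both with $w = \varphi$ and $w = \psi$. The proof of the lemma is based on ideas presented in Chesneau (2007a). Observe that

$$\mathbb{P}\left( |\hat\beta_{jk} - \beta_{jk}| > \tau \varrho_n 2^{j\alpha/2} |k - k_{0j}|^{-\alpha/2} \right) \le P_1 + P_2,$$

where

$$P_1 = \mathbb{P}\left( \Big| n^{-1} \sum_{i=1}^n [g(x_i)]^{-1} w_{jk}(x_i) f(x_i) - \beta_{jk} \Big| > 0.5\, \tau \varrho_n 2^{j\alpha/2} |k - k_{0j}|^{-\alpha/2} \right),$$

$$P_2 = \mathbb{P}\left( \Big| n^{-1} \sum_{i=1}^n [g(x_i)]^{-1} w_{jk}(x_i) \xi_i \Big| > 0.5\, \tau \varrho_n 2^{j\alpha/2} |k - k_{0j}|^{-\alpha/2} \right).$$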

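The proof of the statement is based on Bernstein's inequality 

$$P\Big(\Big|n^{-1}\sum_{i=1}^{n} \eta_i\Big| > z\Big) \le 2\exp\Big(-\frac{n z^2}{2(\sigma^2 + \|\eta\|_\infty z/3)}\Big), \qquad (9.23)$$

where $\eta_i$, $i = 1, 2, \ldots, n$, are i.i.d. with $E\eta_i = 0$, $E\eta_i^2 = \sigma^2$ and $|\eta_i| \le \|\eta\|_\infty < \infty$. 
First, let us construct an upper bound for $P_1$. Note that for $k \in K^{w}_{0jc}$ one has 

$$g(x_i)\, I(x_i \in \operatorname{supp} w_{jk}) \ge C_w^{-1} C_{g1}\, 2^{-j\alpha} |k - k_{0j}|^{\alpha}. \qquad (9.24)$$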
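As a rough numerical illustration of Bernstein's inequality (9.23), the following sketch compares the right-hand side of the bound with a Monte Carlo estimate of the tail probability. The choice of Uniform(-1, 1) variables (so that $\sigma^2 = 1/3$ and $\|\eta\|_\infty = 1$), the sample sizes and the function names are assumptions made only for this example; they do not come from the paper.

```python
import math
import random


def bernstein_bound(n, z, sigma2, eta_sup):
    """Right-hand side of (9.23): 2 * exp(-n z^2 / (2 (sigma^2 + eta_sup * z / 3)))."""
    return 2.0 * math.exp(-n * z * z / (2.0 * (sigma2 + eta_sup * z / 3.0)))


def empirical_tail(n, z, trials, seed=0):
    """Monte Carlo estimate of P(|mean of n Uniform(-1, 1) draws| > z)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        mean = sum(rng.uniform(-1.0, 1.0) for _ in range(n)) / n
        if abs(mean) > z:
            hits += 1
    return hits / trials


# Illustrative (hypothetical) settings: Uniform(-1, 1) has variance 1/3 and sup-norm 1.
n, z = 200, 0.1
bound = bernstein_bound(n, z, sigma2=1.0 / 3.0, eta_sup=1.0)
freq = empirical_tail(n, z, trials=2000)
print(f"Bernstein bound: {bound:.4f}, empirical frequency: {freq:.4f}")
```

The bound is not tight for these settings; the point is only that the empirical frequency stays below it, as the inequality requires.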
Let $\eta_i = [g(x_i)]^{-1} w_{jk}(x_i) f(x_i) - \beta_{jk}$. Then, $E\eta_i = 0$ and, by (9.24), we derive $\|\eta\|_\infty \le C_{g1}^{-1} C_w |k - k_{0j}|^{-\alpha}\, 2^{j(\alpha + 1/2)} \|w\|_\infty \|f\|_\infty$, so that 

$$\operatorname{Var} \eta_i \le \|f\|_\infty^2\, 2^{j\alpha} \int_{W_1}^{W_2} \frac{w^2(t)}{C_{g1} |t + k - k_{0j}|^{\alpha}}\, dt \le \frac{C_w \|f\|_\infty^2\, 2^{j\alpha}}{C_{g1} |k - k_{0j}|^{\alpha}}.$$



Now, applying Bernstein's inequality and recalling that $j < J$, $2^{J(\alpha+1)} = n/\ln n$ and $|k - k_{0j}| \ge 1$, 
we obtain 

$$P_1 \le 2\exp\Big(-\frac{C_{g1}\, \tau^2 \ln n}{8\, C_w \big(\|f\|_\infty^2 + \|w\|_\infty \|f\|_\infty\, \tau/6\big)}\Big).$$

Using the inequality $a/(b + c) \ge \min(a/(2b),\, a/(2c))$, where $a, b, c > 0$, and taking into account that 
$\tau^2 \ge \tau$ for $\tau \ge 1$, we obtain 

$$P_1 \le 2\exp(-\tau \ln n / D_1) \quad \text{with} \quad D_1 = 8\, C_{g1}^{-1} C_w \max\big(2\|f\|_\infty^2,\ \|f\|_\infty \|w\|_\infty / 3\big). \qquad (9.25)$$
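The passage from the intermediate bound on $P_1$ to (9.25) rests on the elementary inequality $a/(b+c) \ge \min(a/(2b), a/(2c))$ together with $\tau \ge 1$. The sketch below checks this step numerically on a grid of $\tau$ values. All numerical constants (`c_g1`, `c_w`, `f_sup`, `w_sup`) are hypothetical placeholders, and the function names are chosen only for this illustration.

```python
def bernstein_exponent(tau, c_g1, c_w, f_sup, w_sup):
    """Coefficient of ln(n) in the intermediate exponential bound on P_1."""
    return c_g1 * tau ** 2 / (8.0 * c_w * (f_sup ** 2 + w_sup * f_sup * tau / 6.0))


def d1(c_g1, c_w, f_sup, w_sup):
    """Constant D_1 from (9.25)."""
    return 8.0 * c_w / c_g1 * max(2.0 * f_sup ** 2, f_sup * w_sup / 3.0)


def split_lower_bound(a, b, c):
    """Lower bound min(a/(2b), a/(2c)) for a/(b+c), valid for a, b, c > 0."""
    return min(a / (2.0 * b), a / (2.0 * c))


# Hypothetical constants, used only to exercise the inequality.
params = dict(c_g1=0.5, c_w=4.0, f_sup=1.5, w_sup=2.0)
taus = [1.0 + 0.5 * i for i in range(40)]

# For every tau >= 1 the intermediate exponent dominates tau / D_1.
gaps = [bernstein_exponent(t, **params) - t / d1(**params) for t in taus]
print("smallest gap over the grid:", min(gaps))
```

A non-negative smallest gap means that replacing the intermediate exponent by $\tau \ln n / D_1$ only weakens the bound, which is what (9.25) needs.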



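In order to construct an upper bound for $P_2$, note that, conditionally on $(x_1, x_2, \ldots, x_n)$, one has 

$$n^{-1}\sum_{i=1}^{n} [g(x_i)]^{-1} w_{jk}(x_i)\, \xi_i \sim N(0, s_{jk}^2),$$

where, by (9.24) and $\sigma = 1$, 

$$s_{jk}^2 = \frac{1}{n^2}\sum_{i=1}^{n} \frac{w_{jk}^2(x_i)}{g^2(x_i)} \le \frac{C_w\, 2^{j\alpha}}{C_{g1} |k - k_{0j}|^{\alpha}\, n^2} \sum_{i=1}^{n} \frac{w_{jk}^2(x_i)}{g(x_i)}.$$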



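Hence, conditionally on $(x_1, x_2, \ldots, x_n)$, 

$$P\Big(\Big|n^{-1}\sum_{i=1}^{n} [g(x_i)]^{-1} w_{jk}(x_i)\, \xi_i\Big| > \frac{\tau \rho_n 2^{j\alpha/2}}{2 |k - k_{0j}|^{\alpha/2}} \,\Big|\, x_1, x_2, \ldots, x_n\Big) \le \exp\Big\{-\frac{\tau^2\, 2^{j\alpha} \ln n}{8 n |k - k_{0j}|^{\alpha} s_{jk}^2}\Big\}.$$

Now, consider the following two sets: 

$$\Omega_v(x_1, x_2, \ldots, x_n) = \Big\{(x_1, x_2, \ldots, x_n): \Big|n^{-1}\sum_{i=1}^{n} [g(x_i)]^{-1} w_{jk}^2(x_i) - 1\Big| > v\Big\},$$

and its complementary, $\Omega_v^c(x_1, x_2, \ldots, x_n)$. Then $P_2 \le P_{21} + P_{22}$ where 

$$P_{21} = E\Big[P\Big(\Big|n^{-1}\sum_{i=1}^{n} [g(x_i)]^{-1} w_{jk}(x_i)\, \xi_i\Big| > \frac{\tau \rho_n 2^{j\alpha/2}}{2 |k - k_{0j}|^{\alpha/2}} \,\Big|\, x_1, x_2, \ldots, x_n\Big)\, I(\Omega_v^c(x_1, x_2, \ldots, x_n))\Big],$$

$$P_{22} = E[I(\Omega_v(x_1, x_2, \ldots, x_n))],$$

and $I(\Omega)$ is the indicator of the set $\Omega$. Since for $\Omega_v^c(x_1, x_2, \ldots, x_n)$ we have 

$$n^{-1}\sum_{i=1}^{n} [g(x_i)]^{-1} w_{jk}^2(x_i) \le 1 + \Big|n^{-1}\sum_{i=1}^{n} [g(x_i)]^{-1} w_{jk}^2(x_i) - 1\Big| \le 1 + v,$$

it is easy to check that 

$$P_{21} \le \exp(-\tau^2 \ln n / D_2) \quad \text{with} \quad D_2 = 8\, C_{g1}^{-1} C_w (v + 1). \qquad (9.26)$$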
In order to find an upper bound for $P_{22}$, we apply Bernstein's inequality with $Z_i = [g(x_i)]^{-1} w_{jk}^2(x_i) - 1$. 
Note that $\operatorname{Var} Z_i \le C_{g1}^{-1} C_w \|w\|_\infty^2\, 2^{j(\alpha+1)} |k - k_{0j}|^{-\alpha}$, $\|Z\|_\infty \le 2\, C_{g1}^{-1} C_w \|w\|_\infty^2\, 2^{j(\alpha+1)} |k - k_{0j}|^{-\alpha}$ and 
$E Z_i = 0$. Application of (9.23) with $z = v$ yields 

$$P_{22} \le 2\exp(-v^2 \ln n / D_3) \quad \text{with} \quad D_3 = 2\, \|w\|_\infty^2\, C_{g1}^{-1} C_w (1 + 2v/3). \qquad (9.27)$$

Now, set $v = 0.5\, \tau \|w\|_\infty$ and observe that for this value of $v$ and $\tau \ge 1$, one has 

$$4\, \|w\|_\infty^{-2} (1 + 2v/3)^{-1} v^2 \ge \tau^2/(v + 1) \ge \tau \cdot \min(1/2,\ \|w\|_\infty^{-1}).$$

To complete the proof, we only need to combine (9.25), (9.26) and (9.27). 
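The final chain of inequalities, obtained with the choice $v = 0.5\,\tau\|w\|_\infty$, can be checked numerically. The sketch below evaluates the three quantities in the chain on a grid; the grid values for $\tau$ and $\|w\|_\infty$ and the function name `chain` are assumptions introduced only for this illustration.

```python
def chain(tau, w_sup):
    """Return the three terms of the chain for v = 0.5 * tau * w_sup.

    left   = 4 * w_sup^{-2} * (1 + 2v/3)^{-1} * v^2   (equals tau^2 / (1 + 2v/3))
    middle = tau^2 / (v + 1)
    right  = tau * min(1/2, 1/w_sup)
    """
    v = 0.5 * tau * w_sup
    left = 4.0 * v ** 2 / (w_sup ** 2 * (1.0 + 2.0 * v / 3.0))
    middle = tau ** 2 / (v + 1.0)
    right = tau * min(0.5, 1.0 / w_sup)
    return left, middle, right


# Hypothetical grid: tau >= 1 and several values of the sup-norm of w.
results = [chain(1.0 + 0.25 * i, w) for i in range(60) for w in (0.3, 1.0, 2.0, 5.0)]
print(all(l >= m - 1e-12 and m >= r - 1e-12 for l, m, r in results))
```

The second inequality is again an instance of $a/(b+c) \ge \min(a/(2b), a/(2c))$, here with $a = \tau^2$, $b = v$ and $c = 1$.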



Lemma 11 Let 6 = 0, C v = [2max(|L (/3 |, IL^I)]" and C g 2 be defined in \2.2\) . Let m be a non- 
random integer, mi < m < J — 1, and k G ^(f m - ^ en > / or c [ c[ m ^ given by \5.11\) and 
J5. 1J$ , respectively, and an arbitrary constant k > 1, one has 



~(m) (m) I 



-I 



-I 



>KQ n 2 2 \=0(n c« ) , n— > 00. 



(9.28) 



i^ere, £> n is defined in t9.20\) and C K is given by formula t6.9\) . 



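Proof of Lemma 11. The proof is very similar to the proof of Lemma 10; therefore, we 
shall just provide its outline. Partition the probability in (9.28) into $P_1$ and $P_2$ with 

$$P_1 = P\Big(\Big|n^{-1}\sum_{i=1}^{n} \varphi_{mk}(x_i) f(x_i) - c_k^{(m)}\Big| > 0.5\, \kappa \rho_n 2^{-m\alpha/2}\Big),$$

$$P_2 = P\Big(\Big|n^{-1}\sum_{i=1}^{n} \varphi_{mk}(x_i)\, \xi_i\Big| > 0.5\, \kappa \rho_n 2^{-m\alpha/2}\Big).$$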



An upper bound for $P_1$, obtained by applying Bernstein's inequality, is of the form 

$$P_1 \le 2\exp(-\kappa \ln n / D_4) \quad \text{with} \quad D_4 = 8\, \|f\|_\infty \max\big(2 C_{g2} C_\varphi,\ \|\varphi\|_\infty / 3\big). \qquad (9.29)$$

In order to derive an upper bound for $P_2$, introduce a set 



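$$\Omega_v(x_1, x_2, \ldots, x_n) = \Big\{(x_1, x_2, \ldots, x_n): \Big|n^{-1}\sum_{i=1}^{n} \varphi_{mk}^2(x_i) - \int \varphi_{mk}^2(x) g(x)\, dx\Big| > v\, 2^{-m\alpha}\Big\}$$

and its complementary, $\Omega_v^c(x_1, x_2, \ldots, x_n)$. Then, similarly to the proof of Lemma 10, obtain $P_2 \le P_{21} + P_{22}$, where 

$$P_{21} = E\Big[P\Big(\Big|n^{-1}\sum_{i=1}^{n} \varphi_{mk}(x_i)\, \xi_i\Big| > 0.5\, \kappa \rho_n 2^{-m\alpha/2} \,\Big|\, x_1, x_2, \ldots, x_n\Big)\, I(\Omega_v^c(x_1, x_2, \ldots, x_n))\Big] \le \exp\Big(-\frac{\kappa^2 \ln n}{8 (v + C_\varphi C_{g2})}\Big).$$

Also, application of (9.23) with $\eta_i = \varphi_{mk}^2(x_i) - \int \varphi_{mk}^2(x) g(x)\, dx$ yields 

$$P_{22} = E[I(\Omega_v(x_1, x_2, \ldots, x_n))] \le 2\exp\Big(-\frac{n v^2\, 2^{-m(1+\alpha)}}{2 \|\varphi\|_\infty^2 (C_\varphi C_{g2} + v/3)}\Big).$$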
Setting $v = a\kappa$, noting that for any $A, B, C > 0$ one has $A/(B + C) \ge \min(A/(2B),\, A/(2C))$, and 
recalling that $\kappa \ge 1$ and $n\, 2^{-m(1+\alpha)} \ge \ln n$ by (5.3), we derive 

$$P_2 \le 2\exp(-\kappa \ln n / D_5) \quad \text{with} \quad D_5 = \max\Big(16a,\ 16\, C_\varphi C_{g2},\ \frac{4\, C_\varphi C_{g2} \|\varphi\|_\infty^2}{a^2},\ \frac{4 \|\varphi\|_\infty^2}{3a}\Big). \qquad (9.30)$$

To complete the proof, it suffices to note that $a > 0$ is arbitrary. 

Lemma 12 Let 6 = 0, Ze£ uiq and rh be given by A5.18\) and A6.2\) , respectively. Consider the 
non- asymptotic finite dimension matrices A* and B* with elements 



A* kl = J cp(z + k 0m - k)<p(z + k 0m - l)\z\ a dz, k, I e K$ m , (9.31) 
B\ k = J \p(z + ko m -k)tp(z + k 0m -l)\z\ a dz : leK$ m ,keK£ m . (9.32) 



Let u( m ) be the solution of the system of equations ( f5. 15\) . 
If A > max (C\i, Cxi), then, as n — > oo, 



|uM _ EfiM || > A n 2 mQ / 2 J = O (n~& ) , (9.33) 
where Cxi and Cx2 are defined in 116. 5|) and C u is defined in (6.7). 

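Proof of Lemma 12. Observe that for any $m$, by (5.15), one has 

$$\|\hat u^{(m)} - E\hat u^{(m)}\| \le \|(A^{(m)})^{-1} (\hat c^{(m)} - c^{(m)})\| + \|(A^{(m)})^{-1} B^{(m)} (\hat v^{(m)} - v^{(m)})\|,$$

so that 

$$P\big(\|\hat u^{(m)} - E\hat u^{(m)}\| > \lambda \rho_n 2^{m\alpha/2}\big) \le P\big(\|(A^{(m)})^{-1} (\hat c^{(m)} - c^{(m)})\| > 0.5\, \lambda \rho_n 2^{m\alpha/2}\big) + P\big(\|(A^{(m)})^{-1} B^{(m)} (\hat v^{(m)} - v^{(m)})\| > 0.5\, \lambda \rho_n 2^{m\alpha/2}\big) = P_1 + P_2.$$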



Now note that, by assumption (2.1) and the dominated convergence theorem, as $n \to \infty$, 
one has $A^{(m)} = C_g 2^{-m\alpha} A^* (1 + o(1))$ and $B^{(m)} = C_g 2^{-m\alpha} B^* (1 + o(1))$, where the matrices $A^*$ 
and $B^*$, defined in (9.31) and (9.32), are independent of $m$, since the sets $K^{\varphi}_{0m}$ and $K^{\psi}_{0m}$ are 
defined in terms of $k - k_{0m}$ and $l - k_{0m}$. Therefore, $\|(A^{(m)})^{-1}\| = C_g^{-1} 2^{m\alpha} \|(A^*)^{-1}\| (1 + o(1))$ 
and $\|(A^{(m)})^{-1} B^{(m)}\| = \|(A^*)^{-1} B^*\| (1 + o(1))$. Hence, setting $\kappa = C_{\lambda 1}^{-1} \lambda$ in Lemma 11, where $C_{\lambda 1}$ 
is defined in (6.5), and taking into account that the set $K^{\varphi}_{0m}$ contains no more than $U_\varphi - L_\varphi + 1$ 
indices, we obtain 

(m) _ _(m)i| ^ Cg2 A g n 2~ ma / 2 \ ^ / (m) _ (m) C g2 A g n 2" mct / 2 



2IKA*)- 1 !! ;^^"V Cfc " Cfc l> 2V^-^ + i||(A*)- 1 



^ P (| 5 M _ C M| > 2C- 1 Ae n 2~"W 2 ) = O (n- 2 ^-^ 1 )- 1 a 



Similarly, using Lemma [TOl with w = ip and C T given by (|6.8p . and recalling the definitions of v( m ) 
and v^" 1 ), one can derive an upper bound for P 2 as 

/ \ n r >ma/2 \ ( \ _ nmce/2 

P 2 < p ||vW- v W||> ^fL _ < VJ f i .- U > 



2||(A')-iB«|| / " j£ I 2^-L„ + l ||(A*)-1B> 

r 2(C.C, 2 )-i 
which completes the proof of the lemma. 

Proof of Lemma 4. Note that by definition of $\hat m$, whenever $\hat m > m_0$, there exists $j > m_0$ 
such that $\|(\hat f_{m_0} - \hat f_j) I(\Xi_{m_0})\|^2 > \lambda^2 2^{j\alpha} \rho_n^2$, where $\rho_n$ is defined in (9.20). Therefore, 

$$P(\hat m > m_0) \le \sum_{j=m_0}^{J-1} P_j \quad \text{with} \quad P_j = P\big(\|(\hat f_{m_0} - \hat f_j) I(\Xi_{m_0})\|^2 > \lambda^2 2^{j\alpha} \rho_n^2\big). \qquad (9.34)$$



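Observe that since 

$$\|(\hat f_{m_0} - \hat f_j) I(\Xi_{m_0})\| \le \|(\hat f_{0,j} - f_{0,j}) I(\Xi_{m_0})\| + \|(\hat f_{c,j} - f_{c,j}) I(\Xi_{m_0})\| + \|(\hat f_{0,m_0} - f_{0,m_0}) I(\Xi_{m_0})\| + \|(\hat f_{c,m_0} - f_{c,m_0}) I(\Xi_{m_0})\|,$$

one has the following upper bound for $P_j$ defined in (9.34): 

$$P_j \le P_{0,j,m_0} + P_{0,j,j} + P_{c,j,m_0} + P_{c,j,j},$$

where, for any $m_0 \le m \le j$, 

$$P_{0,j,m} = P\big(\|(\hat f_{0,m} - f_{0,m}) I(\Xi_{m_0})\| > 0.25\, \lambda\, 2^{j\alpha/2} \rho_n\big),$$
$$P_{c,j,m} = P\big(\|(\hat f_{c,m} - f_{c,m}) I(\Xi_{m_0})\| > 0.25\, \lambda\, 2^{j\alpha/2} \rho_n\big).$$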

Since supp(/o, m ) C E m E H mo for m > mo, one has 

||(/0,m " / ,m)I(H mo )|| 2 = ||(/ , m - / , m )I(H m )|| 2 = ||/ , m - / ,m|| 2 (9-35) 

< ||u( m ) - u( m )|| 2 + 2(U V - L v + l)^ 2 2~ w . 
Hence, by (|9.35p and Lemma \12\ since mo < m < j, one derives 

Vo,j,m < P (||u (m) - uM|| > .25 A 2^ 2 Pn - A^2(U V — L v + 1)2^ A = O (V^) . 

Now, let us consider the second term, V c ,j,m- Note that supp(ip mk ) and H mo have non-empty 
intersection if and only if k E K mtmo , where 

K mtmo = {k : 2 m - mo [min(L ¥ ,,L^) -U v \-U v <k- k 0m < 2 m " m ° [max(l^, tfy) - L v ] - L v ) . 
Hence, for m > mo, 

J—l oo 

ii(/ c , m -/ c , m )i(H mo )ii 2 < i| V ( m )-vM|| 2 + y, E (^-M 2 + E E b %- 

j'=m k (zK i j = J kGK 

Here, by (|9.57p . we have 

°° / 2s' 2s' \ 

E E b % - A22 ~ 2JS * = ( n"^+^ (Inn) 2^ J , 

j = J haK 

where s* is defined in (|9.56p . Also, 

(bfk ~ bfk) 2 < (b fk - b fk ) 2 I(\b fk - b fk \ > 0.5d2^/ 2 |fc - VI" Q/2 ) + b% 
since IflSy/fcl > d2^ a / 2 \k-k 0j ,\~ a / 2 ) < I(\b j/k \ > 0.5d2^ a / 2 \k-k 0j/ \- a / 2 )+I(\b fk -b fk \ > 0.5d2^ a / 2 \k- 
koji\~ a / 2 ) and, for j > rriQ and n large enough, I(|6j'fc| > 0.5d2 ja / 2 |fc— k ji\~ a / 2 ) = 0. Denote Clu = 

max (| min(L v , L^) — 2t/ v ,|, | max([/ (/ ,, U$) — L v \) and observe that ify, mo C j& : |fc — fcqj'l < 2P'~ m °Ci J irj. 

Hence, using Cauchy inequality and (|9.56p . one obtains 

j-i 

V V fe 2 , fc < A 2 {2C LU ) [l - 2/p) + 2- 2mos '2- 2s *( m - mo l 

Combining all inequalities above, we derive that for any m > mo, 

|| (/cm - / c , m )I(H mo )|| 2 < ||v( m ) - v(™)|| 2 + A 2 2" 2Js * + A 2 (2C LU )^- 2 M + 2~ 2m « s ' 2~ 2s * '^"^ 

j-i 

+ Yl E fe- v*) 2][ (iVfc - Vfci > o.5^' Q / 2 |fc - vr a/2 )- 

j'=m hcfr .. 

Now, by Lemma [TUl with w = (p and w = ip, obtain 

P CJ> < P(||v( m ) -v( m )|| >0.25A2^/ 2 p n -,4 2 2" 2Js *+A 2 (2C L c/) (1 " 2/p) + 2- 2m ° s, 2- 2s *( m - m °)) 
J-i 

+ 2 E P ^J' fc " b 3'k\ > 0.5d2^/ 2 |fc - V|~ Q / 2 ) 

= O (n- {CTCx ° rlx ^j + O (n^'^d^j , 
which completes the proof. 

9.5 Proofs of the statements in Section 5: risk of the zero-affected part of the 
wavelet estimator 

Proof of Lemma [J3 Note that 

CO 

A 1 = ||E/ ^-/ ^|| 2 = ^ £ b%, A 2 = E||/^-E/ ^|| 2 = £ E(a mfc -a mfc ) 2 , 

where a mk = u k m ^ for k 6 i^Qm- F rom the characterization (I3.1D of Besov spaces, it follows that, for 
any k, one has b 2 k < A2~ 2]S ' , and, therefore, since the number of indices in the set K^ m is finite, 

Ai = Oljr 2~ 2js ' ] = O (2" W ) . (9.36) 



\3= m 

Now, consider A 2 . Let, as in Lemma O A^" 1 ) be the matrix with the entries given by (|5.9|) . 
D( m ) = v / diag(A( m )) and Q( m ) = (DW)- 1 aH(DW)- 1 . In the following proof, for the sake 
of clarity, we shall suppress the index m. Rewrite the systems of equations (15. 8p and (|5.15p . 
respectively, as 

QDu = D~ 1 c + D~ 1 e-D~ 1 Bv, Q D u = D 1 c - D 1 Bv, (9.37) 

so that 

u - u = D^Q^D^c - c) - D^Q^D^v - v) + D X Q X D (9.38) 
Therefore, 

$$\Delta_2 = E\|\hat u - u\|^2 = O(\Delta_{21} + \Delta_{22} + \Delta_{23}), \qquad (9.39)$$

with 

$$\Delta_{21} = E\|D^{-1} Q^{-1} D^{-1} (\hat c - c)\|^2, \quad \Delta_{22} = E\|D^{-1} Q^{-1} D^{-1} B (\hat v - v)\|^2, \quad \Delta_{23} = \|D^{-1} Q^{-1} D^{-1} \varepsilon\|^2. \qquad (9.40)$$

By Lemma El one has 

$$D_{kk} \ge C\, 2^{-m\alpha/2} \exp\big(-0.5\, b\, 2^{\beta(m+1)}\big),$$

and since $D$ is a finite-dimensional diagonal matrix, the latter implies 

$$\|D^{-1}\| = O\big(2^{m\alpha/2} \exp(0.5\, b\, 2^{\beta(m+1)})\big). \qquad (9.41)$$

Therefore, since the set K^ m is finite, by Lemma [5J one has 

E||D^(c-c)|| 2 = £ Var(4 m) )/4? = O (n" 1 ) , 

so that we derive 

A 21 = O (IID-^PHQ- 1 !! 2 EUD-^c - c) || 2 ) = O (V 1 2 ma exp(b2^ m+ ^)) . (9.42) 



In order to derive an upper bound for A 22 , note that from (|9.10p . (|9.40p and considerations 
above, it follows that 

A 22 = O (HD- 1 !! 4 !^- 1 !! 2 E||B(v - v)|| 2 ) = O (2 2ma exp(262 /3 ( m+1 )) E||B(v - v)| 

Sinceexp(-62 m/3 |z|- /3 ) is an increasing function of \z\ and, for < z + ko m — k<U v and k 6 if * m , 
one has \z\ < 2{U ip — L v ), for k £ i^om> we derive 

C k k = f ¥ 2 mk (x)g(x)l{2 m x - I E n s )dx = O (V mQ j \ 2 {z + k 0m - k) \z\ a exp {-62^1^} dz 

= O (V ma exp {-62 m ^- 1 )(^ - ^) _/3 }) • 

Hence, since the sets K^ m and K^ m are finite, by definition of vector v, Lemmas [6] and [8] and Cauchy 
inequality, we obtain 



E||B(v-v)|| 2 = ]T ]T B^BMCoyia^a^Kn- 1 ]T £ JmkiA hh y/C kk C u 
heK$ m k,ieK* m heK % m k,ieK* m 

= O (n~ l 2~ ma exp |&2 m/3 - b2 m ^~ 1 \u ip - L v )~ fi - b2 mfi M^}) , 
where is defined in Lemma Since U v — L v > 4, we finally obtain 



A 22 = O (n- 1 2 ma exp(62 m/3 [2 /3+1 + 1])J . (9.43) 

Now, for the function e m (x) defined in (|5.4p . one has 

A 23 = O (IID" 1 !! 2 HQ" 1 !! 2 IID^^H 2 ) (9.44) 
where 

i 2 

-2 
x kk 



£m(x) ip* mk {x) g(x)dx 



(9.45) 



If b = 0, by Cauchy-Schwarz inequality, one obtains ||D 1 e|| 2 < Y2keKg AcfcH £m o|| 2 llVrnfeffll 2 ) 
where £ m o(x) = e m (x)l(\x — xq\ < C2~ m ). By calculations similar to proof of Lemma[8]in the case 
of b = 0, one can show that ||(/> TO fc#|| 2 = / (p 2 nk (x)g(x)dx = O (2~ 2ma ). Also, since b 2 k < A2~ 2js ' , 
one has 

(oo 
E E 
j=m\k-k 0j \<C2J- 



O J2 2-^ s '2^- m ^ 1 - 2 ^ ) = o(2 



-2ms' 



, 3=m 



Recalling ([9.1U|) . we obtain in the case of b = 0, 

A 23 = O (2- 2ms ' ) . (9.46) 



Now, let us consider the case of b > 0. Denote <f mk (x) = ip mk (x)I(2 m x — k S Ij mk i = 
J Vmk( X )^A X )9( X ) dx and let 

Zmax&mk^jl) = arg maxf^^)^^)^^)]- (9.47) 

Observe that since ^(zmax) / 0, we have Ij mk i/A k ™' = 0(1). Consider the collection of indices 

£ mj k = {l- < I < 2 j - 1, supp(</4 fe ) n supp(^) ^ 0} . 

It is easy to see that £ m jfc ^ [2 J_m (L (/ , + 5b + k) — U^p, 2 J ~ m (U ip + 5b + k) — L^], so, for each k, 
there are 0(2- ?_m ) terms such that I £ £ m jk- Note that |zmax(</?mfc' — l z max( ( / ? mfc)l and f° r 
each A;, there is only finite number of terms such that \z max ((ip* mk , ipji)\ = \z max ((ip* mk )\. Indeed, 
straightforward calculation shows that 

^max^mfci i>ji) = min^t/^ - 5 h + k- kom), 2 m ~ j (U i ( > + I - k j)] if k 0m - k < 0.5 (U v + L v ), 
and 

ZmaA i P*mki'*Pji) = max[(L^ + 5 6 + £; - k 0m ),2 m ~ j (L,,p + I - k 0j )] if fc 0m - A; > 0.5 (t/ v + L v ). 

Hence, |z max (^ fc , ^/)| = \z max (f mk )\ if Z > 2^ m ([^ - 5 b + k) - or I < 2^" m (L^ + <5 6 + fc) - L^. 
Since we also need I £ C mjk , we obtain that \z max (((p mk ,i()ji)\ = \z max {{(p mk )\ if / G where 

C [2^- m (L^+ ( 5 6 +fc)-lV,2 i - m (^+^^ 
and, thus, £* fe contains at most 2(U^ — L^) values of / for each k. If I £ £ m j k \ C* m j k = £ c m j k : then 
V- m {L v + 5 b + k)-L^<l< 2^ m (U^ -5 b + k)-U^. (9.48) 

Then, by (I9TT5D . 

Ijmki ~ y (p*(2 m - j t + fcom - k) ip(t + fcoi - 02" j ° exp(-6|tr /3 2 J/3 )(it 
where 
and 



€& = Uf + l-k oj if k 0m - k < (U v + L v )/2 



t( mil = L^ + l- k 0j if k 0m — k> (U v + L v )/2. 
Using formula ()9.17|) . we derive that 



^ = ( 2 0'-)(^i/2) exp |_ fe2 ^ [| t CM)|-/3 _ a-O-^l^M)!-^ }) > (9.49) 
A kk 



where Zmax^ is defined in (|9.16p . 

Denote hj m M = |tmax| — 2^~ m ^\z^^ \ and observe that 



h jmk l = 2°" m) (U v -5 b + k)-U^-l if k 0m -k< (U v + L v )/2, 



and 



hjmkl = I ~ 2 (j " m) {L v + 5 b + k) + L 4 , if kom -k>(U ip + L v )/2. 

Comparing the latter formulae with definition of C> m j k , we derive that, for I G ^mjk> ® ^ hjmkl < 
C/ l 2- ?_m for every value of k, where Ch > is a constant which depends only on the choice of the 
wavelet basis. Now, for any < x < y and f3 > one has for some < £ < y — x 

x -f> - y-P = 0(y - x ){y - > 0(y - x)y-P. 

Applying the above inequality with x = |imax| and y = 2^~ m ^\z^^ ,S ^\, we obtain that 

\Jk,t)i-p _ o-(j-m)/3, (fc,fc,5),-/3 > o h . 2 -(j-m)P\ Jk,k,S)\-P 

I ''max I c rmax I — P'^mkl^ Kmax I ) 

and, thus, for I £ £mjk' we nave 



(-4 



(m)x_l 
kk ) 



Ijmki\ = O (2^-)^V2) e ^{-bp\z^\-^h jmkl }) . 



(9.50) 



Now, it follows from (|9.44p that A23 = A231 + A232, where 



A 



231 



^232 



o E 



E E (4* 

j=mleC* mjk 



(m)x-l 



Ljmkl 1 1 "j7 1 



E E (4a 

j=ml£C c ., 



(m)\-l 



7 



jmkl 1 1 "j7 1 



Using the facts that the set $K^{\varphi}_{0m}$ is finite, $|b_{jl}| = O(2^{-js'})$ and $(A^{(m)}_{kk})^{-1} |I_{jmkl}| = O(1)$, we derive 
that, as $m \to \infty$, 



A 23 i = 0(2 



(9.51) 



For $\Delta_{232}$, using (9.50) and taking into account that $h_{jmkl}$ changes by unit increments, we obtain 



A 



232 



°( E 

2 -2ms> 



OO L- h 



£ £ 2-i s '2(i--)(^V2) exp {_^|4W)|-y3 2 ^ Vw } 



j=mh jmk i=0 



(9.52) 
Finally, combining expressions ([939]) . (|9lSj) . (|9~l3"j) . (f93T|) and (f9752]) . we obtain 

A 2 = (V 1 2 ma exp(2&2^ m+1 )) + 2- 2ms ') . 



(9.53) 



To complete the proof of (|5.17p . set m = mo, where ttiq is defined in (|5.18p and combine f|9.53j) with 
dEMD , (19391 . ©321 an d dS- 

Now, we need to show that E||/(5 m) - E/^H 4 = o(l). Note that it follows from (|Q7|l - (jO}l l 
that A* = O (AJ + Ag + A3) where, similarly to the case of squared difference, 

AJ = OOlD^fHCT 1 !! 4 E||c-c|| 4 ) , 

OOID" 1 !! 8 ||B|| 4 ||Q- 1 || 4 E||v-v|| 4 ), 



A* 
AS 



OdlD^fllQ^fllD-^H 4 ). 



Applying Lemma M and using ^7T2\i and (^^1]) with b = 0, we obtain AJ = O (n" 2 2 2mQ ) = o(l) 
and A* 2 = (2 2ma E||v - v|| 4 ). Also, similarly to (IQij) and (19^61) . A3 = O (V 4ms '). To complete 
the proof of the lemma, recall the definitions of v and v, apply (|9.5p with k G -?Q) m , and note that, 
for k S -f^om' one nas 1^ ~~ ^0m| = O(l). 



9.6 Proofs of the statements in Section 5: risk of the zero-free part of the 
wavelet estimator 



Proof of Lemma [2] . Let R = E||/ c>mo - f c>mo 
Ri = 



R\ + R 2 + R3 , where 
J-i 



E Var(a mofc ), R 2 = E E 6 2 fe , #3 = E E E(b jfc - b jk f. 



3=m keK * c 



By Lemma [6] we derive that, as n — > 00, 



Ri 



O 



n 



- 1 2 moa [|A ; -fco mo r a exp(62^|fc-A ; o mo r' 3 ) 



v 



= O ( T r 1 2 Tn °( 1+a > exp(2-( /3+1 ) Inn)) = o ( (lnn)~i 
Using (|5.3p and ()9.56p . we derive that 

fi 2 = O (2~ 2Js *) = O ( (lnn) _4 r ) = O ( (lnn)~7 



since s* = s' for 1 < p < 2 and s* = s > (s + 1/2 - l/p)/2 for 2 < p < 00 due to s > 1/2. For i? 3 , 
we have 

J-i J-i 

^3 = E E b h+H E Var (M 

J=mo |fc-fc 0j |<2^- m j=m \k-k 0j \>23- m 

fj-l 

= ° E [2" 2jV (2 J '~ mo ) 1 " 2/p + ^" 1 2 j 2 Qmo exp^™ 
y=m 

= O (V™o(s+2/ P -i) + (lnn)( 2+ "V/3 n -i+2"^ +1 ') = f(l nn yf 
To complete the proof of the lemma, note that the upper bounds are uniform for $f \in B^s_{p,q}(A)$. 

Proof of Lemma 3. Note that 

R = EH/cm - fcf = Ri+R 2 + Rs + R4, (9.54) 

where 

00 

Ri = ^2 E(a mk - a mk ) 2 , R 2 = ^ ^ b 2 k , 
J-l 

-Cf 



>= m keK* c 



R s = E E e (b jk -b Jk ) 2 i(b 2 jk >d 2 en y a \k-h 



Oj 



J-l 

r 4 = E ^n^<dV2^-£^i 



with p n defined in (|9.2U|) . Using Lemma El we obtain 

R 1 = oln~ 1 2 ma I k - k 0m \~ a ) = o(n~ 1 2 ma (In n) I(a=1) ) (9.55) 



since the set contains O(lnn) terms and the sum ^feei^ 1^ ~~ ^0m| _a is uniformly bounded 

if a > 1. 

It is well known (see, e.g., Johnstone (2002), Lemma 19.1) that if $f \in B^s_{p,q}(A)$, then for some 
constant $c^* > 0$, dependent on $p$, $q$, $s$ and $A$ only, one has 

$$\sum_{k=0}^{2^j - 1} b_{jk}^2 \le c^*\, 2^{-2js^*} \quad \text{with} \quad s^* = \min(s, s'). \qquad (9.56)$$

Therefore, an upper bound for $R_2$ is of the form 

$$R_2 = \sum_{j=J}^{\infty} \sum_{k=0}^{2^j - 1} b_{jk}^2 = O\big(2^{-2Js^*}\big).$$
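The bound on $R_2$ is a geometric tail: summing the level-wise bound $c^* 2^{-2js^*}$ from (9.56) over $j \ge J$ gives $c^* 2^{-2Js^*}/(1 - 2^{-2s^*})$. The short sketch below confirms this numerically. The values of `s_star` and `c_star`, the truncation at 200 levels and the function names are assumptions made only for the illustration.

```python
def level_energy_bound(j, s_star, c_star=1.0):
    """Level-wise bound c* 2^{-2 j s*} on the sum of squared coefficients, as in (9.56)."""
    return c_star * 2.0 ** (-2.0 * j * s_star)


def tail_sum(J, s_star, c_star=1.0, levels=200):
    """Sum of the level-wise bounds over j = J, ..., J + levels - 1 (truncated tail)."""
    return sum(level_energy_bound(j, s_star, c_star) for j in range(J, J + levels))


def tail_constant(s_star):
    """Constant 1 / (1 - 2^{-2 s*}) of the geometric series."""
    return 1.0 / (1.0 - 2.0 ** (-2.0 * s_star))


s_star = 0.75  # hypothetical smoothness index
for J in range(3, 11):
    ratio = tail_sum(J, s_star) / level_energy_bound(J, s_star)
    print(J, round(ratio, 6), round(tail_constant(s_star), 6))
```

The ratio does not depend on $J$, which is exactly the statement $R_2 = O(2^{-2Js^*})$ with a constant depending on $s^*$ only.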

/ 2/ 2/ \ 

If 1 < p < 2, then s* = s and i?2 = O I n 2s '+ Q (lnn) 2s '+ a J . If 2 < p < 00, then s* = s and, since 
s > 1/2, one has p > (4s - 2a - 2)/(4s 2 - a - 1). Hence, 

2s/(a + l) > 2s / /(2* / + a), 

so that, for 1 < p < 00, one has 



R 2 = O In (Inn)5?+5 1 . (9.57) 

In order to obtain upper bounds for $R_3$ and $R_4$, note that 

$$R_3 \le R_{31} + R_{32}, \qquad R_4 \le R_{41} + R_{42}, \qquad (9.58)$$
where 



J-i 



R31 = Y, E ^hhk-bjk) 2 H(hk-b jk ) 2 >0.25d 2 g n 2^\k-k oj \- a ] 



J-l 



fi 32 = E E E (hk -bjk) 2 Kb 2 jk > 0.25 d 2 e 2 n y a \k- k 0j [ 



i= m k & Kt 3C 
j-i 



(9.59) 



J-i 



^2 = E E ^% 2 fc <2.25d 2 ^2^|A ; -A;o j r 



Applying Lemma [10] with w(-) = tjj(-) and r = 0.5d, we obtain 

> 0.25 d 2 g 2 n 2 ja \k - k 0j \~ a ) = O L-°- 5d / Cd 



'jk ®jk, 

where Cd is given by (|5.20p . Hence, by Lemma [6] and inequality ^a + b < ^/a + Vb, for d > 4(7^, 
as n — >• 00, we obtain 



R ^ ^ E E [ E &* " 6 ^) 4 • F ^hk ~ b jk ? > 0.25 d 2 e 2 n 2?<* \k - k 0j \~ a ) 

■m ■ 
( 



1/2 



o 



V 



n _l 2 ^ ^ IJfe _^ r¥+n _ 123 . Q £ ljfe _ jfco , l . 



2s' 2s' 



Ojc 



= O I n 4C d ) = o ( n 2^+^ (lnn)^K 
Similarly, by (|9.56[) . 



R 
o(n _! *) E E 6?* = o(n _1 : 



(9.60) 



(9.61) 



Now, consider i?32 and -R42. Note that it follows from Lemma [6] that 
(j-i 

R 32 = O \J2 E h~ l2JQ | fc - feojr a I(^fc > 0.5d 2 n" 1 lnn2 ja |A:- fc 0i |~ a )] 



(j-x \ 
Y E rninfClnn)- 1 ^^-^^-^-!-] 



and, similarly, 



i? 



12 



/j-i \ 
^ Yl min[6 2 fc ,n- 1 lnn2^|A;-A ; o j r1 
Hence, 

$$R_{32} = O\big((\ln n)^{-1} R_{42}\big) = O(R_{42}),$$

so that one needs to study only $R_{42}$. Partition $R_{42}$ as $R_{42} = R_{421} + R_{422} + R_{423}$, where 

R ^ = E E [n-'hm^lk-ko^], R i22 =Yl E b %> 



(9.62) 



3=3* kGKt ]c 



#423 



32-1 

£ 

j=ji+i 



n- 1 lnn2 ja \k- k 0j \- a + b 



\k—k j\>Nj \k— k j\<Nj 

and the values of j±, j 2 and Nj will be defined later. It is easy to see that, by (|9,56[) . 

#421 = Ofn- 1 lnn2^ Q (lnn) I ( Q = 1 )) , #422 = O (2"^-*) , 
/ ia-l r 

#423 = O E [ n_1 lnn2^ 1 - a (lnn) I ( Q=1 )+2- 2 ^'iV j 1 - 2/p 
If a 7^ 1, the two terms in (|9.64|) are equal to each other when 

l/(«-2/p) 



(9.63) 
(9.64) 



A7j = (n- 1 lnn2 J '( 2s ' +Q ) 



and, for this value of Nj, one has 



# 



423 



J ^__J 2/p-l 2j(s'-as) 

O 2^ (n/lnn)«- 2 /p2 «- 2 /p 
u'=ii+i 



(9.65) 



Therefore, $R_{423}$ behaves differently when $\alpha s > s'$ and $\alpha s < s'$, and we consider those cases separately. 
First, consider the case when $\alpha s = s'$. Then 



2/p-l 



O ( (j'2 — ji){n/ \D.n)"^p I = O ( (lnn/n) 2 »'+« Inn I if as = s' 



#423 

If a > 1, as > s', choose j\ and j'2 such that 

1 . 3' 

2 J1 = (n/ln?!)^ 7 ^, 2 J2 = (nj In n) ?*W+S) . 

Note that if 1 < p < 2, one has s* = s > s', so that j'2 < j\ and #423 = 0. If 2 < p < 00, 
then j 2 > ji. Also, it follows from ([9.630 and ([9.650 that #423 = O n Q - 2 /p (lnn) a - 2 /f2 a ~ 2 /p 



Hence, #421 = O y(n/\nn) 2s'+ a j ; _R 422 = Q ^(n/lnn) 2s '+« J and #423 = O ^(n/lnn) 2 ^ T +« 
so that 

i? 42 = O ( (ra/lnn)" 2 ^ J if as > s', a > 1. (9.66) 



Similarly, if a > 1, as < s', choose ji and j 2 such that 

2^ = (n/lnn)^ 2 ^ 1 ), 2 J2 = (ji/lnn) 2 ^. 
2/p— 1 1-g 2j2(s' —as) ' 

In this case, -R423 = O ( n a - 2 /p (In n) <*- 2 /p 2 «- 2 /p ), and direct calculations yield R421 = O [(n/krn) 



(9.67) 



-R422 = O ( (n/lnn) 2 S +i j anc j ^ 423 = Q (Jn/lnn) 2s+1 J, so that 

i? 42 = O ((n/lnn) - ^ J if as < s', a > 1. 



Finally, if a = 1, set ji = j'2 such that 



2 jl = (n/ln 2 n)^+r 



and obtain 



4s* -1 



i? 42 = O n (lnn) 2s *+! if a = 1. 



(9.68) 



Now, to complete the proof of (EOTI one just need to combine (|93Ij) . (|9Jt5|L (19371) . dflTBO)) , (^THTj) . 
(|9.62p and ()9.66|) (|9.68[) . and to note that all upper bounds are uniform for / G Z?* (.A). 

In order to prove (|5.22|) . note that 

R* = nkrn ~ fcf <Rl+R* 2 + Rt 



where 



^ 00 

R{ = O j E|| ^ (a mfc - a mfc )99 mfc (2;)|| 4 J , R* 2 = O || E E bjk^jk(x) 



Rt = O E 



j-i 



E E &-6ifc) 2][ (^A>rf 2 ft» 2, ' a i*-*oir 



Observe that, by Lemma[6l since 2 m ( a+1 ) = o(n/lnn), as n — > 00, 



R\ = O 2 m ^ E(a mfc - a mfc ) ' 

= 0(V 2 2 m ( 2Q+1 )) =o(l). 
For i?2) by (|9.56p . we have 



R* = O 



( 


00 






E E * 




\ 







O {n~ 3 2 m ( 3Q + 2 ) + n~ 2 2 m ( 2a+1 )) 



O 



-Iras' 



O(l). 



Finally, similarly to (|9.59p . partition R\ as R\ = R^ t + R\ 2 with i?^ and i?g 2 corresponding 
to l{\b jk - b jk \ 2 > 0.25d 2 g n 2^ a \k - k j\~ a ) and l{b 2 k > 0.25 d 2 g n 2^ a \k - k 0j \~ a ), respectively. 
For R31, applying Lemmas [61 and [TOl with w = ip and given by (|5.20p . and also noting that 
SfepA"* 1^ ~~ ' = ^0-) ^ or ^ > 1 5 we derive 



Ojc 



^31 = O 



/ 7-1 \ 

n E E E [l^'fe- & ife| 4l (l^- 6 ifc| 2 >0.25d 2 g n 2^\k-k 0j \- a ) 
/ J-i 



O 



2/3 



F(|6 jfe - 6 jfc | 2 > 0.25 g n 2^ \k - k 0j \- a ) 



1/3 



J-l 



n 



-d/(3C d ) 



n 



-10/3 2 i(10a+4)/3 + n -8/3 2 j(8a+2)/3 + ^-2 ^ja 



o (n 1-d/(3C<i) ) = o(l), n->-oo, 



since tf > 3C^. For i?3 2 , using Lemma [U] and (|9.56|) . we derive that 
/ J-i 

/ J-i \ / J-i 



R* 2 = O 



O 



n E [Vte- 3 nb% + \n- 2 nb 4 jk ) 



o n 



^[2J(i-6»') + 2- 



since n -1 ^!*; - fc(y|~ Q < 0.25 6 2 fc /(d 2 Inn). Note that m > m implies 2 m > n 1 /( 2 «'+ a ), so that 



R 



32 



6s' -1 4a' 



o n + n 2s'+ a — (i) ? 



which completes the proof of the lemma. 



9.7 Proof of the minimax upper bounds for the risk in Section 6 

Proof of Theorem 2. Since $\hat m = m_0$ for $b > 0$, the validity of Theorem 2 for $b > 0$ follows 
directly from Lemma 2. For $b = 0$, observe that 



A = E[||/ m -/|| 2 = J2 n\\L-f\\ 2 Km = m<m )]+E[\\f m - f\\ 2 I(m = m > m )] = A 1 + 



m=mi 



and consider terms Ai and A2 separately. 
Denote 

O (n~2^TT (lnn)^ 1 ] if 6 = 0, as < s', 

R(n) = { > _ 2s' \ 

O I n (lnn)^ 2 ) if 6 = 0, as > s', 

and note that for any m > m\ 

ML - ft < 2[E||/ mo - /|| 2 +E||(/ m - / mo )I(x G ~ m )|| 2 + E||(/ m - / mo )I(x G 



(9.69) 



4)H 2 ] 
where mi is defined in ()5.3[) and set H m is defined in ()6. 1 1) . By Lemmas [Q and [3j we obtain 

nfmo - ff = O (rT*b + R(n) 
If m = m < mo, then by definition of m, we derive that 

2s' 



n\(L - fmo)I(x e ~ m )|| 2 < A 2 2 moa n- 1 lnn = O f tT 

Now, recall that H m is defined in such a way that supp(/o jm ) G H m for any m and Eji C Hj2 for 
Ji > J2) so that for m < mo one has 

E||(/ m -/)I(xGH^)|| 2 = E||(/ , m + / Cim -/o im -/ c , m )I(xG^)|| 2 

= E||(/ C , m - f c , m )I(x G ~^)|| 2 < E||/ C , m - / c , m || 2 = O (fl(n)) 

as n — > oo. Noting that 



mfm - fm )Hx G E c m )\\ z < 2 [E\\(f m - f)l(x G ~^)|| 2 + E||(/ mo - f)l(x G H c 

and combining all formulae above, we obtain that Ai = O (R(n)) as n — > oo. 

By Lemmas H] and El one has E||/ , m - /o,m|| 4 = o(l) and E||/ C , m - / c , m || 4 = o(l). Then, 
Lemma U] yields 

A 2 < \/E[||/ m - /|| 4 y/F(m = m>m ) = O (n~^ + „3tsiTij-45^ _ 0(n _1 ) 

provided A > max (C\i, C\2, 2C>) and d > 2(a + l) _1 (2a + 3)C<i, which completes the proof of 
Theorem [2j 

9.8 Proofs of the statements in Section 7 

Proof of Theorem 3 is based on the following lemma. 

Lemma 13 Let Assumption A hold with $b = 0$ and $0 < \alpha < 1$. Then, 

Var(b jk ) = O (n -1 2 J ' Q min(l, |fc - A^T")) , 

q + 3 / 2 -■ fa+g) a + 3 A 

E|6j fe - 6 ife |^+T = O n ^+12^^+1) +n 2 («+ 2 > 2 J ) 

Proof of Lemma 13. The proof of the first statement is very similar to the proof of validity of 
formula (9.6). The proof of the second statement is based on Lemma 3.1 in Chesneau (2007a), which 
states that whenever $\int [g(x)]^{1-\nu}\, dx < \infty$ for some $\nu > 2$, one has 

$$E|\hat b_{jk} - b_{jk}|^{\nu} = O\Big(n^{1-\nu} \int |\psi_{jk}(x)|^{\nu} [g(x)]^{1-\nu}\, dx + n^{-\nu/2} \Big(\int \psi_{jk}^2(x) [g(x)]^{-1}\, dx\Big)^{\nu/2}\Big). \qquad (9.70)$$

To complete the proof, note that for $\nu = 1 + 2/(\alpha + 1) > 2$ one has $\int [g(x)]^{1-\nu}\, dx < \infty$ and apply 
(9.70). 
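The choice $\nu = 1 + 2/(\alpha+1)$ can be checked by simple arithmetic. Assuming, as in the polynomial-zero case, that $g(x)$ behaves like $|x - x_0|^{\alpha}$ near its zero, the function $[g(x)]^{1-\nu}$ behaves like $|x - x_0|^{-\alpha(\nu-1)}$, which is integrable exactly when $\alpha(\nu - 1) < 1$. The sketch below tabulates both conditions; the local power-law form of $g$ and the function names are assumptions made for this illustration.

```python
def nu(alpha):
    """Moment order nu = 1 + 2 / (alpha + 1) used in the proof of Lemma 13."""
    return 1.0 + 2.0 / (alpha + 1.0)


def singularity_exponent(alpha):
    """Exponent e such that g^{1 - nu} ~ |x - x0|^{-e} when g ~ |x - x0|^alpha."""
    return alpha * (nu(alpha) - 1.0)


# For 0 < alpha < 1 one expects nu > 2 and exponent < 1 (integrable singularity).
for alpha in (0.1, 0.25, 0.5, 0.75, 0.9, 1.0):
    print(alpha, round(nu(alpha), 4), round(singularity_exponent(alpha), 4))
```

The exponent equals $2\alpha/(\alpha+1)$, which is below 1 precisely for $\alpha < 1$; at $\alpha = 1$ both conditions become borderline ($\nu = 2$ and exponent 1), in line with the restriction $0 < \alpha < 1$ in the lemma.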
Proof of Theorem 3. The proof of this statement is similar to the proof of Lemma 3. Indeed, 
similarly to the proof of Lemma 3, partition the risk as $R = E\|\hat f - f\|^2 = R_1 + R_2 + R_3 + R_4$ where 



Ri 



R* 



R 1 



2 m i-l 00 2-7-1 

- vi:) 2 i R2 = E E b %> 



j=J k=0 



E E(a mi 

k=0 
J-l 2J -1 

E E E [fe " b i*) 2 > d2 ^ 2'° I* - 

j=0 fc=0 
J-l 2J -1 

E E ^ d2 ^ v a \k - k 0j \~ a ) 

j=0 k=0 



Oj 



with $\rho_n$ defined in (9.20). Since $1/g$ is integrable and $m_1$ in (5.3) is finite, it is easy to show that 
$R_1 = O(n^{-1})$. Also, same as before, $R_2 = O(2^{-2Js^*})$. If $p \ge 2$, then $\alpha + 1 < 2s + 1$ since 
$s > \max(1/2, 1/p)$ and $\alpha < 1$, so that $R_2 = O(n^{-2s/(2s+1)})$. If $1 \le p < 2$, then $s^* = s'$ and 
$2s'/(1 + \alpha) > \max\{2s'/(2s' + \alpha),\ 2s/(2s + 1)\}$, so that 

$$R_2 = O\big(\max\{n^{-2s/(2s+1)},\ n^{-2s'/(2s'+\alpha)}\}\big).$$



Now, similarly to the proof of Lemma 3, partition $R_3$ and $R_4$ as $R_3 \le R_{31} + R_{32}$ and $R_4 \le R_{41} + R_{42}$. 
Using Lemma 13, as $n \to \infty$, obtain upper bounds 



J-l 2 J -1 

^ < EE [ : 



'jk-bjkY > 0.25 d 2 ^2^|fc-fc 0j 

j=0 fc=0 
/j-l 

O ^2 i n- d(1 - 2/ ^ )/(2C ' [i) 
\i=o 



1-2/v 



2/i/ 



= O ^2 J ' n -rf(l-2/^)/(2C d ) 

provided (|7.2p holds, and also 

i?41 



2/i/ 



2/1/ 



n 



-1//2 2^ 



On 



o(n -a ^) E E 6?* = o(n _1 : 



Now, same as before, $R_{32} = O((\ln n)^{-1} R_{42}) = O(R_{42})$, so that we need to construct upper 
bounds for $R_{42}$ only. Partition $R_{42}$ as $R_{42} = R_{421} + R_{422} + R_{423}$ where 

ji 2J-1 J-l 2-7-1 

R ^ = E E [ n " x ln ^ 2ia i fc - M~1 , ^22 = E E b h> 

j=0 k=0 j=j 2 k=0 

J2-1 



i? 



423 



E E 

i=ii+i ^ |fc-fe j|>iVj 



n 



lnn2 ja iV; 



l-p/2 



+ E n " 

\k-k 0j \<Nj 



1 Inn 2^^"" 



and the values of $j_1$, $j_2$ and $N_j$ will be defined later. It is easy to see that, same as before, 
$R_{421} = O(n^{-1} \ln n\, 2^{j_1 \alpha})$ and $R_{422} = O(2^{-2 j_2 s^*})$. For $R_{423}$ we can write the following expression 



R 



423 



J2-1 

£ 



2~3 S V 



Inn 



n 



2 ia N~ 



l-p/2 



Inn 



+ V a N 



a A7"l— ck 



n 
If p > 2, we choose j\ = j'2 such that 2-? 1 = (lnn/n) 1 /( 2s+1 ) and obtain R42 = O ((lnn/n) 2s ^ 2s+1 ^) . 
If 1 < p < 2, we choose Nj which equalize the two terms in -R423 and obtain, similarly to (|9.65|) . 



R 



423 



/ 2/p-l 2j 2 (s'-as) 

O I (n/lnn)— 2 /p2 q - 2 A> 

/ 2/p-l 2j 1 (s'-as) 

(n/lnn)— 2 /p2 a ~ 2 /p 



if as < s' 
if as > s' 



/ 2/p-l \ 

I U2 — h) (n/ln7T,) Q - 2 /p J if as = s 



If as < s', then choose 

2 jl = (ra/lnn) 2 ^, 2 j2 = (n/lnn) 8 '^ 1 ) 
so that ji < j'2. Direct calculations show that in this case 



R 



12 



o((n/lnn) Cl ) with Ci = — -4 



2(s' - as) 



2s 



2/p-a (2s + l)(2/p-a) 2s + 1 
and i? 42 = O ((lnn/n) 2s / (2s+1 )). If as > s', then set 

a 1 

2 n = (n/lnn)2^+^, 2 J2 = (ji/lnn) 2 ^ 7 ^, 
so that again j\ < Here we have 

2/p-l 2(as-s') 



R 



12 



O ((n/lnn) C2 ) wii/i C2 



2s' 



2/p-a (2s' + a)(2/p-a) 2s' + a 



(9.71) 



and i?42 = O ^(lnn/n) 2s '/( 2s ' +Q) y If as = s', then note that j 2 - ji = 0(lnn), so that -R42 = 

O ^(lnn/n) 2s '/( 2s ' + °'j = O ((lnn/n) 2s /( 2s+1 )) . Now, to complete the proof, just combine the ex- 
pressions for Ri, R 2 , -R31, R41, i?32 and R^- 